Oh boy. I replied to the wrong post! Moving it now.

Philosophical answer: Dawid, P., “On individual risk,” Synthese 194(9):3445–3474 (2017). arXiv:1406.5540

Philosophical question: In hindsight, what _should_ their predicted probability have been?

“So do pundits.”

“But models use data and express their predictions numerically, so people take them too seriously.”

“Polls express their error numerically, too. Pundits seldom admit they could be in error. Some refuse to even admit they’ve ever been wrong.”

“People know pundits’ interpretation of polls may be wrong without them acknowledging it. Everybody knows that an opinion is just an opinion.”

“Have you ever heard of Fox News?”

“Deceptive punditry is obviously harmful to democracy, but so are misunderstood model predictions.”

“So, as a political scientist, instead of criticizing media that lies to voters, you’ve chosen to criticize media that allows voters to have access to information that’s too complicated for many of them to understand? What kind of priority is that?”

“Well, the people who lie to voters pay me to conduct polls so that they’ll have something to lie about. The modelers are just free-riders.”

“Ah.”

Example (ii): 9 voters. I sample 3. 2 are for Clinton, 1 for Trump. My estimated margin is 67% for Clinton. What is my standard error? (Straightforward, with effort.) With what probability do I think Clinton will win? (Less than 67%.)

Example (ii) is simpler. One way to think about it is that each voter has today, and will have on election day, the same unchanged probability of voting for Clinton, X, and I’m trying to estimate that number. My best (modal) guess for Clinton’s margin is then 9X, rounded to the nearest ninth, since 60%, for example, is an impossible margin. I can also figure out the probability that, if X = 55%, say, rolling that 55% die 9 times for the 9 voters will end up with Clinton getting 5, 6, 7, 8, or 9 votes (since the three voters I sampled could all, in this model, change their minds on election day and vote differently from today).
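Under this independent-voters model, the chance Clinton wins is just a binomial tail probability. Here is a minimal sketch using the toy numbers above (9 voters, X = 55%):

```python
from math import comb

def p_clinton_wins(x, n=9):
    """Chance Clinton gets a majority if each of n voters
    independently votes for her with probability x."""
    majority = n // 2 + 1
    return sum(comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(majority, n + 1))

# With X = 55%, Clinton's win probability is about 62%,
# well below the 67% observed in the sample of three.
print(p_clinton_wins(0.55))
```

Note how the win probability gets pulled back toward 50%, consistent with the “(Less than 67%)” answer above.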

But that assumes each voter has an independent probability X of voting for Clinton, which is an extreme assumption and unrealistic. It’s closer to reality to think that we’ve got Clinton voters and we’ve got Trump voters and we’re trying to figure out how many of each there are: a voter doesn’t flip a coin to decide which he is.

What is the opposite extreme? Assume that each of our sampled people has two other non-sampled people just like him. Then we can deduce that there are exactly 6 people for Clinton and 3 for Trump. Since there are only 9 people total, that means we can predict with 100% probability that Clinton will win, and that it will be by a 6–3 margin.
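A middle ground between the two extremes is to treat the number of Clinton supporters among the 9 as a fixed unknown count, put a prior on it, and update on the 2-out-of-3 sample with a hypergeometric likelihood. A toy sketch (the flat prior here is my own assumption, purely for illustration):

```python
from math import comb

N, n, k = 9, 3, 2  # population size, sample size, Clinton voters in sample

def likelihood(c):
    """Chance of drawing k Clinton voters in a sample of n,
    if exactly c of the N voters are for Clinton."""
    if k > c or n - k > N - c:
        return 0.0
    return comb(c, k) * comb(N - c, n - k) / comb(N, n)

weights = [likelihood(c) for c in range(N + 1)]  # flat prior on c = 0..9
p_win = sum(w for c, w in enumerate(weights) if c >= 5) / sum(weights)
print(p_win)  # about 0.74
```

So this fixed-voters reading lands between the independence model and the 100% extreme.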

So is it really useful to try to arrive at a number like 84%? I like the idea of knowing how an expert would bet on whether Clinton gets at least 50%. I think I’d prefer his honest answer without his depending too much on formal analysis, though. Like how I liked Brian Leiter’s ranking of philosophy departments before he went formalistic. I think Professor Leiter (a left-wing Nietzsche scholar) is biased, but I know his biases and can correct for them, and I value his gut opinion more than what he does now, which is some sort of expert survey where he gets to pick the experts so it looks fairer to naive people.

This is related to the Bayesian-vs.-classical conundrum: how fancy do you make your model, and do you put in your subjective priors?

pselphology comes from Greek and has to do with pebbles and counting therein. The word never caught on in the U.S. and I am not sure if it is still commonly used in the UK.

There are advantages to browsing the web on your pselphone.

I remain grateful that I never did anything like promising to eat a bug if Trump won.

I really like the idea of mapping forecast probabilities to sports probabilities to help intuition (at least for sports fans).

So in the last election the forecasters could have said that the probability that Clinton is elected is equal to…

– (Princeton and PollyVote): the probability of an extra point being kicked (under the old NFL rules).

– (FiveThirtyEight): the probability of a 30 yard FG being kicked.

The current Economist model could say that Biden’s probability of winning the electoral college is about the same as Kevin Durant making a free throw.
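To automate the translation, a small lookup table would do. The success rates below are rough ballpark figures I am assuming for illustration, not official statistics:

```python
# (success rate, description) -- assumed, approximate figures
SPORTS_EVENTS = [
    (0.99, "extra point under the old NFL rules"),
    (0.90, "30-yard field goal"),
    (0.88, "Kevin Durant free throw"),
    (0.75, "league-average NBA free throw"),
    (0.50, "coin flip"),
]

def sports_analogy(p):
    """Return the sports event whose success rate is closest to p."""
    rate, description = min(SPORTS_EVENTS, key=lambda e: abs(e[0] - p))
    return description

print(sports_analogy(0.98))  # extra point under the old NFL rules
```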

Matthew said,

“I want to push back a bit on the idea that people aren’t good at reasoning under uncertainty. We reason under uncertainty all the time in everyday life, and I don’t think we’re that bad at it!”

https://ordinary-times.com/2011/03/29/what-do-you-mean-we-paleface/

Paul said,

“The postmortems on the U.S. election by the pselphologists were quite varied in character and often a combination of the following:”

I’ve learned a new word!

Sure, there were other predictions out there, but I and many other people considered 538 to be the most credible. I’m no election forecasting expert so I had to choose who to believe based on track record and on how much their methodology made sense to me; 538 won on both counts.

Most of the other predictions I saw were clearly over-certain. My default assumption is that model predictions about anything will be over-certain, so any mechanistic model that doesn’t have a human inflating the error bars is likely to understate them. I remember discussing with friends the ways the forecasts could be wrong in either direction — the usual suspects being the extent to which the voting population differs from the polled population, and people favoring one candidate being more likely to answer — and the fact that this could lead to a regional or nationwide polling bias. This possibility was recognized by political commentators, but I think a lot of models, Andrew’s included, ignored it.

Anyway I was trusting 538 so I was surprised but not shocked by Trump’s victory. But among my friends I think there was some wishful thinking and a tendency to believe the forecasts that said HRC had it sewn up.

The funny thing is, though, that the friend I mentioned above did read 538 and thought of it as the single most reliable forecast!

Kewl

Bruce contends that Andrew et al.:

“…implies a rigorous, quantitative approach that can be trusted to the final digit…”

Yet right at the head of the page, before the numbers it says:

“…Right now, our model thinks…[it is]…very likely…”

As well as the other factors you pointed out. I just don’t see that as offering unrealistic precision.

Paul:

See my article with Julia Azari, “19 things we learned from the 2016 election,” and our follow-up, “How special was 2016?”

a. Even in Russian roulette there is a 17% chance of failure

b. Well, she did win the popular vote by almost three million

c. We noted that things were changing near the end

d. The FBI and KGB did it

e. Our competitors did even worse

f. Sam Wang of the Princeton Election Consortium had to eat an insect and we didn’t

Akin to the Big Bang of 13.8 billion years ago, some unexpected, momentous things just happen and they are hard to explain.

Only semi-joking:

https://www.smbc-comics.com/comic/prediction

Given that (in the US, anyway) the length of a football field is a common unit for describing really big things, maybe people would respond to a distinction between an any-given-Sunday upset and a Super-Bowl-III surprise.

This guy is giving me some serious Harry Enten vibes: all salesman, not much substance.

Yes, you do. You show two probabilities and the CI.

I’m suggesting showing the CI *instead* of the probabilities. Just show the range of outcomes in creative ways, forget probabilities as a way to communicate.

Carlos:

Oh no! Our probabilities don’t add up to 100%. We better fix this.

“Most historians just study the past. But Allan Lichtman has successfully predicted the future.”

Ugh!

> Following Andrew’s links I don’t see where you get just a plain “87%”.

The Economist page is this one: https://projects.economist.com/us-2020-forecast/president

Before going into details it presents this summary:

“Right now, our model thinks Joe Biden is very likely to beat Donald Trump in the electoral college.

Joe Biden – Democrat

Chance of winning the electoral college: around 9 in 10 or 90%

Chance of winning the most votes: better than 19 in 20 or 98%

Predicted range of electoral college votes (270 to win): 220-434

Donald Trump – Republican

Chance of winning the electoral college: around 1 in 10 or 10%

Chance of winning the most votes: less than 1 in 20 or 3%

Predicted range of electoral college votes (270 to win): 104-318”

“reporting a probability like 87% chance of a Biden win conveys a false sense about the precision of the approach.”

Following Andrew’s links I don’t see where you get just a plain “87%”. His 12-June post has two charts both of which have huge shaded areas clearly indicating the range of possibilities.

I’m sure the public doesn’t understand exactly what errors “95% confidence interval” means, but I’m even more sure people understand that forecasts and predictions are inherently uncertain.

A lot of the public doesn’t even understand what “percent” means. No one is ever going to convey a detailed knowledge of uncertainty to those people. Yet they can still have an intuitive sense that predictions are uncertain, and the media is so full of wrong and wacky predictions that it would be surprising if people weren’t aware that election forecasts are highly uncertain.

Joshua:

See here. Short answer is that by trying to predict win/loss rather than vote share, you’re throwing away a lot of information.

I’m with Caleb. “Sabato’s Crystal Ball” sounds like the musings of some guy. Rating things as “Safe”, “Likely”, “Lean”, or “Toss-up” conveys that the ratings are qualitative. The nebulousness of what “Likely” actually means is a feature because it gives the reader a sense that the predictions are imprecise.

In contrast, the header of your model on the Economist website says “The Economist is analysing polling, economic and demographic data to predict America’s elections in 2020”. This implies a rigorous, quantitative approach that can be trusted to the final digit reported. Especially for people who are not familiar with all the assumptions that go into this sort of modeling, reporting a probability like 87% chance of a Biden win conveys a false sense about the precision of the approach. I don’t think you would say that you are certain the actual probability is 87% and not 84% or 90%, but the way you present things indicates this to the general public (and I’ve seen far worse examples from other forecasts where they report things like 87.23% which is plainly a ridiculous level of precision).

I think you could revise the way you present your model results to better convey the inherent uncertainty in this sort of forecast.

It can both be true that (a) people misunderstand probabilities and (b) forecasters are overconfident.

Here are all the final 2016 US election forecasts from prominent models I could find. They are ordered from most to least confident.

Probability of Clinton Win

The Princeton Election Consortium: 99%

PollyVote, DeSart and Holbrook: 99%

The Huffington Post, Natalie Jackson: 98%

PredictWise, David Rothschild: 93%

Daily Kos: 92%

Slate, Pierre-Antoine Kremp and Andrew Gelman (based on Drew Linzer model): 90%

New York Times, The Upshot: 85%

Elliott Morris: 84%

PollSavvy: 82%

FiveThirtyEight, Nate Silver et al.: 72% or 71% (polls-plus or polls-only forecasts)

How much forecaster overconfidence was there in 2016? Well, that depends on how much of an upset you think Trump’s win was. But regardless, I think it’s useful evidence to look at the full set of 2016 forecasts. As Daniel and Chris note, many were much more confident than FiveThirtyEight.
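One crude way to put these forecasts on a common scale after the fact: since Clinton lost, each forecast’s Brier score for the event “Clinton wins” is just its stated probability squared (lower is better). A sketch over the list above (labels abbreviated):

```python
forecasts = {
    "Princeton Election Consortium": 0.99,
    "PollyVote / DeSart and Holbrook": 0.99,
    "Huffington Post": 0.98,
    "PredictWise": 0.93,
    "Daily Kos": 0.92,
    "Slate (Kremp and Gelman)": 0.90,
    "NYT Upshot": 0.85,
    "Elliott Morris": 0.84,
    "PollSavvy": 0.82,
    "FiveThirtyEight (polls-plus)": 0.72,
}

# The outcome was 0 (Clinton did not win), so Brier = (p - 0)**2.
brier = {name: round(p**2, 3) for name, p in forecasts.items()}
for name, score in sorted(brier.items(), key=lambda kv: kv[1]):
    print(f"{score:.3f}  {name}")
```

By this one-election scoring, the least confident forecast comes out best, but as the paragraph above notes, a single outcome is thin evidence.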

Also this, FWIW, where Lichtman responds to Silver’s critique:

https://fivethirtyeight.com/features/keys-to-the-white-house-historian-responds/

In the Times as well: https://www.nytimes.com/2020/08/05/opinion/2020-election-prediction-allan-lichtman.html

Maybe you and Andrew are both right.

They claim to be “modest” but just looking through the site they don’t seem to be mocking themselves too loudly, so the idea that they’re not intending it to be taken seriously seems like a stretch. They’re playing both sides of the bet: selling the “Crystal Ball” while claiming it doesn’t mean anything.

I mean, I get that there’s a lot of subjectivity in Lichtman’s method – but how would you distinguish that subjectivity from the subjective judgements made in other methods?

Nate Silver has pointed out that Lichtman predicted that Trump would win the popular vote.

Also:

https://fivethirtyeight.com/features/despite-keys-obama-is-no-lock/

“We think that we know about uncertainty, and that when we have added a standard error or a confidence interval to a point estimate we have increased knowledge in some way or other. To many people, it does not look like that; they think that we are taking away their certainties—we are actually taking away information, and, if that is all that we can do, we are of no use to them.

This was brought home to me forcibly when Peter Moore and I appeared before the Employment Select Committee of the House of Commons—which is not a random sample of the population at large. Our insistence that we could not deliver certainties was regarded as a sign of weakness, if not downright incompetence. One may laugh at that, but that is the way it was—and that is what we are up against. ” (Bartholomew, 1986, p. 428, JRSS:A)

Set your dad up on Twitch!

At minimum all forecasters should express the uncertainty in appropriate and honest and accurate mathematical terms, with appropriate and honest caveats. They don’t have a responsibility to make the public understand probability. Everyone knows forecasts have error.

Andrew –

Could you elaborate on why you’re so disdainful of Lichtman’s prediction method?

-snip-

Retrospectively, the keys model accounts for the outcome of every American presidential election since 1860, much longer than any other prediction system. Prospectively, the Keys to the White House has correctly forecast the popular vote winner of all seven presidential elections from 1984 to 2012, usually months or even years prior to Election Day. The chart to the right shows the Keys model’s vote share forecasts in comparison to the actual vote for the incumbent party’s candidate in each of those elections. On average, the Keys model missed the final election result by 2.4 percentage points.

-snip-

https://pollyvote.com/en/components/models/mixed/keys-to-the-white-house/

+1. On the other extreme was Sam Wang at Princeton who called it 99% for HRC IIRC. I believe Andrew’s model was somewhere around 90% shortly before the election.

Natalie,

I read your article and basically agree with everything. I would echo Andrew’s point that we are being super careful with uncertainty, most notably, and novelly, by (a) adding extra measurement error for other sources of non-sampling variance and (b) attempting to adjust for partisan non-response, which to my knowledge no other major forecaster is doing.

But this is where things get tricky, right? Because if every forecaster thinks they are doing something better, there is no incentive to NOT do a forecast. We certainly believe that our forecast is worthwhile because without it, we would be left in a world with fewer good forecasters and certainly more bad punditry. But there is definitely an acknowledgement of the doom loop here.

One thing our forecast may be blind to is heretofore-unforeseen biases in likely-voter filters because of covid and postal-service troubles. I am of the belief that such errors are about as likely as they always have been (I have been meaning to blog about this), but I think this is an area where we might be able to make some improvements.

Finally, I am pleased that we can engage in a discussion about this subject with respect and level-headedness. Not everyone in this industry is so willing.

Outside of 538, there were a number of sources giving 85-95 percent to Clinton in the weeks before the election.

You called that range something between “a major upset” and “not quite a sure thing, but getting to that territory”

538 was an outlier in the predictions.

Honestly it’s hard to know what to do if people think about probability this way.

Perhaps one solution is to not give numerical estimates at all. We could map probability estimates onto plain-language descriptions instead, e.g.:

50-55% chance becomes “dead heat, could go either way”

55-60%: “Candidate A has a slightly better chance of winning.”

60-70%: “Candidate A has a noticeable edge, but if Candidate B wins that will only be a slight upset.”

70-80%: “Candidate A will probably win, but it’s far from a sure thing.”

80-90%: “Candidate A is very likely to win. A win by Candidate B would be a major upset, though not a historic one.”

90-95%: “Candidate A is extremely likely to win. Not quite a sure thing, but getting to that territory.”

95%+ : “Candidate A is extremely likely to win. A win by Candidate B would be one of the biggest upsets in US presidential election history.”

I have to admit, I find it hard to come up with wording that would allow someone to put the phrases in correct order, even. And I’m not sure what problem this would solve…except my friend could not look at “Candidate A will probably win, but it’s far from a sure thing” and claim that the forecast said it was a sure thing!
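For what it’s worth, the bins are straightforward to encode; a sketch with abbreviated wording (thresholds as proposed above):

```python
def describe(p):
    """Map a win probability p for the favored candidate A
    (0.5 <= p <= 1) to a plain-language description."""
    if not 0.5 <= p <= 1:
        raise ValueError("state p for the favored candidate")
    if p < 0.55:
        return "dead heat, could go either way"
    if p < 0.60:
        return "A has a slightly better chance of winning"
    if p < 0.70:
        return "A has a noticeable edge; a B win would be only a slight upset"
    if p < 0.80:
        return "A will probably win, but it's far from a sure thing"
    if p < 0.90:
        return "A is very likely to win; a B win would be a major upset"
    if p < 0.95:
        return "A is extremely likely to win; not quite a sure thing"
    return "A is extremely likely to win; a B win would be historic"

print(describe(0.72))  # A will probably win, but it's far from a sure thing
```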

This would’ve been reported as 52% with a margin of error of 4 percentage points (the margin of error is 2 standard errors), thus a “statistical dead heat” or something like that.

The error seems to be describing this as a “statistical dead heat” when in fact it indicates an 84% chance of victory.

A strict reading of the notation “52% +/- 2%” yields the interval (50%, 54%). Later you talk about “margin of error” being two standard deviations, or (48%, 56%). I think reporting +/- sd is going to be very misleading to the public who will interpret any interval with more certainty than it deserves.
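For the record, the 84% figure is just the normal-approximation probability that the true share exceeds 50% when the estimate is 52% with a 2-point standard error, i.e. one z-score:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

share, se = 0.52, 0.02
z = (0.50 - share) / se      # -1: the 50% threshold sits 1 SE below
p_win = 1 - normal_cdf(z)    # about 0.841
print(p_win)
```

So “52% plus or minus one 2-point standard error” and “an 84% chance of victory” are two presentations of the same number, which is exactly why the “statistical dead heat” framing misleads.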

Speaking of N = 1, I see the *NY Times* reporting things like “This man called the last election for Trump, what does he say now?” on the top of their web site (I’m not even going to bother linking this garbage). My dad called the last election for Trump based on talking to people at bars in suburban Detroit, so maybe he should take up political punditry with a larger audience than me.

In the public’s defense, the link between the popular vote and the electoral vote is rather confusing statistically. The polls last time were pretty good on Clinton’s margin of victory in the popular vote, but not so good on Trump winning the electoral college.

There’s a fun article in a recent *New Yorker* about how the Simulmatics Corporation created the future. I suspect Andrew’s coverage on the blog will show up in six to nine months. It’s about Robert Kennedy’s application of analytics to his brother’s election in 1960. Not surprisingly, given the name, Simulmatics was a giant simulator of elections. It worked with a form of deterministic regression and poststratification (aka “Doctor P”). What surprised me is that the Kennedy election was a dead heat in the popular vote. And that so many people of both parties thought this kind of social science analytics was outright evil.

Elliott, Merlin, and I are keeping our forecast up. Why? Simplest answer is that news orgs are going to be making probabilistic forecasts anyway, so we want to do a good job by accounting for all those sources of polling error that Jackson discusses.

Academics are judged on the same basis as newspapers: the quality and quantity of their readership.

Do you think a lot of these polls are being done by organizations that are not trying to do a good job accounting for error? Or to put it another way, when combining sources, how much do you try to account for underlying bias of the source organization as opposed to error and non-representativeness?

Agree with Caleb. I read “Crystal Ball” as self-deprecating. And their “About” page is consistent…

“…we’re also modest enough to know that no Crystal Ball can foresee all the twists and turns of a turbulent era in American politics. Thus, our motto remains ‘He who lives by the Crystal Ball ends up eating ground glass!'”

Andrew, it seems you have improved your model to have uncertainty intervals that widen noticeably as they move into the future.

The “chance of winning the most votes” at the top of the page is 98% for Biden and 3% for Trump (!). Let’s say the right figure is 97.5% for Biden.

The popular-vote prediction on election day for Biden is the 95% uncertainty interval [49.3%, 58.3%]. I got it by looking at the chart; the only figure given is the point estimate, 54.2% (different from the central point of the interval, which would be 53.8% according to my guess).

If the 95% uncertainty interval is calculated using quantiles, there is already a 2.5% probability of Biden getting less than 49.3% of the popular vote, and one still has to add the probability of the [49.3%, 50%] interval.

On the other hand, if it is a high-density interval it could be the case that the probability of Biden losing the popular vote is just 2.5%. Does the asymmetry of the distribution explain these numbers?
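The distinction can be checked on a toy skewed posterior. The mixture below is purely my invention (not the Economist’s model); it just illustrates that when a central 95% quantile interval has its lower endpoint below 50%, the probability of losing the popular vote necessarily exceeds 2.5%:

```python
import random

random.seed(1)

# Stand-in for a left-skewed posterior over Biden's vote share:
# mostly N(0.545, 0.018), with a fatter left-tail component N(0.50, 0.03).
draws = sorted(
    random.gauss(0.545, 0.018) if random.random() < 0.9
    else random.gauss(0.50, 0.03)
    for _ in range(100_000)
)

lo, hi = draws[2_500], draws[97_500]      # central 95% quantile interval
p_lose_pop = sum(d < 0.50 for d in draws) / len(draws)

# By construction 2.5% of draws fall below lo; since lo < 50%,
# P(share < 50%) = 2.5% + P(lo < share < 50%) > 2.5%.
print(lo, p_lose_pop)
```

With a highest-density interval instead of a quantile interval, the excess mass need not sit below the lower endpoint, which is the asymmetry question raised above.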

I’m not sure that anyone trained in any intellectual field can fully comprehend the degree to which they’ve internalized approaches which others have not.

To me, if it is demonstrated that presenting data causes people to not vote in material numbers, then you fix what you’re saying. Without that focus, without that specificity clarifying the space to be analyzed, you could actually make communication worse by trying to make it better.

To sum, recognize the inherent nature of the problem, and wait until something gives you information which is actionable. You can better define actionable if you think about the limits of communication from the perspective of those who are not inside the bubble of your shared understandings.

Caleb:

I’m doubtful. We’re living in a world where the deterministic “13 keys to the presidency” thing continues to get respectful news coverage every four years. See here, for example: https://www.cnn.com/2020/08/07/us/allan-lichtman-trump-biden-2020-trnd/index.html

Anon:

We do show uncertainty intervals right on the main page of the Economist forecast.

Ultimately I agree: you shouldn’t stop making election forecasts because of what *some* people do with them. Free speech is a fundamental right.

For what it’s worth, I am also realistic about these models existing. My goal is not to get them taken down, because I believe that would be a futile effort. My goal is to spread caution about the models, particularly given my role in the field in 2016. I was not cautious enough – not maliciously or carelessly, I just fell into the don’t-do-anything-subjective trap I describe in the article – and that was a disservice to everyone. I think these models have been taken too seriously because they worked in 2008–2012, which is really not a long track record to go on. Perspective on how we discuss the models, and how the media covers them, is my goal.

I am grateful that Andrew, Elliott, and Merlin are thinking so carefully about these things, and I have been impressed with their willingness to adjust the model and engage with criticism.
