## The hot hand and playing hurt

So, was chatting with someone the other day and it came up that I sometimes do sports statistics, and he told me how he read that someone did some research finding that the hot hand in basketball isn’t real . . .

I replied that the hot hand is real, and I recommended he google “hot hand fallacy fallacy” to find out the full story.

We talked a bit about that, and then I was thinking of something related, which is that I’ve been told that professional athletes play hurt all the time. Games are so intense, and seasons are so long, that they just never have time to fully recover. If so, I could imagine that much of the hot hand has to do with temporarily not being seriously injured, or with successfully working around whatever injuries you have.

I have no idea; it’s just a thought. And it’s related to my reflection from last year:

The null model [of “there is no hot hand”] is that each player j has a probability p_j of making a given shot, and that p_j is constant for the player (considering only shots of some particular difficulty level). But where does p_j come from? Obviously players improve with practice, with game experience, with coaching, etc. So p_j isn’t really a constant. But if “p” varies among players, and “p” varies over the time scale of years or months for individual players, why shouldn’t “p” vary over shorter time scales too? In what sense is “constant probability” a sensible null model at all?

I can see that “constant probability for any given player during a one-year period” is a better model than “p varies wildly from 0.2 to 0.8 for any player during the game.” But that’s a different story. The more I think about the “there is no hot hand” model, the more I don’t like it as any sort of default.

1. Phil says:

With regard to your statement that ‘I can see that “constant probability for any given player during a one-year period” is a better model than “p varies wildly from 0.2 to 0.8 for any player during the game.” But that’s a different story. The more I think about the “there is no hot hand” model, the more I don’t like it as any sort of default.’

I’ve never interpreted ‘there is no hot hand’ to mean the probability of success for any given player is constant over a long period. That would be a very strong version. To me, ‘there is no hot hand’ is a statement about conditional probability: the probability of hitting your next shot (or the next baseball, or whatever) is independent of your success on the past few shots. Your point is still valid — a player with a broken finger will presumably be more likely to have a string of failures than when they were healthy, and then when they’re healthy again they are more likely to have a string of successes — so depending on how one tests for the ‘hot hand’ one could see effects from injuries. But it doesn’t seem inevitable. We could imagine a ‘success probability’ parameter that varies with time, let’s say our estimate of it is somehow constrained to vary slowly, and my no-hot-hand model is that the probability of success depends only on that one parameter, whereas the hot hand model says the probability of success depends on both the value of that parameter and whether I’ve made my two previous shots. Neither of these would require ‘constant probability for any given player during a one-year period.’

To me, the ‘no hot hand’ does make sense as a default. It’s not that there could really be no hot hand effect whatsoever — as has been discussed many times, except for some rare examples involving physical constants, no effect is literally zero. But (1) you can easily imagine the ‘hot hand’ effect going the other way: maybe if I’ve made my two previous shots, I start taking harder shots so I’m less likely to hit the next one. And (2) the effect doesn’t have to go the same direction for every player. Maybe some players are more likely to hit again after hitting a few in a row, and others less. Before seeing any data, I would expect a non-zero effect associated with having hit some previous shots, but I wouldn’t know whether it should be positive or negative for the average player, much less for any individual player. It’s impossible for me to return to the state of ignorance I had before hearing announcers and fans talk about the ‘hot hand’, and before hearing about the ‘hot hand fallacy’, and before hearing about the ‘hot hand fallacy fallacy’, but if I imagine myself in that state I can imagine thinking “the biggest effect by far is that some players are better shooters than others, and any effect associated with a string of previous successes is going to be a small modification of that, and could go either direction.”

Putting it all together, I agree that knowing what I know now, ‘no hot hand’ is not a good default assumption. But (a) it’s still not a terrible one, and (b) it was definitely not an obviously poor choice given the state of understanding as recently as a few years ago.
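Phil's two models can be put side by side in a small simulation. The sketch below is only an illustration: the sine-wave ability curve, the 0.1 boost, and every function name are made up for the example, not taken from any paper discussed here. It also touches the caveat in the comment above, that a slowly varying ability with no dependence on recent shots can still show up as a higher hit rate after two makes, depending on how one tests.

```python
import math
import random

def simulate(n_shots, base_p, delta=0.0, seed=0):
    """Simulate hits (1) and misses (0).

    base_p: function mapping shot index -> slowly varying success probability.
    delta:  hypothetical boost applied only when the two previous shots were
            hits; delta = 0 is the 'no hot hand' model in the sense above.
    """
    rng = random.Random(seed)
    shots = []
    for t in range(n_shots):
        p = base_p(t)
        if t >= 2 and shots[-1] == 1 and shots[-2] == 1:
            p = min(1.0, max(0.0, p + delta))
        shots.append(1 if rng.random() < p else 0)
    return shots

def rate_after_two_hits(shots):
    """Empirical hit rate on shots that follow two consecutive hits."""
    followers = [shots[t] for t in range(2, len(shots))
                 if shots[t - 1] == 1 and shots[t - 2] == 1]
    return sum(followers) / len(followers)

def flat(t):
    # Constant ability, used as a reference case.
    return 0.5

def slow_wave(t):
    # Made-up slowly varying ability: oscillates between 0.35 and 0.65.
    return 0.5 + 0.15 * math.sin(2 * math.pi * t / 5000)

if __name__ == "__main__":
    cases = [("flat, no hot hand", flat, 0.0),
             ("flat, hot hand", flat, 0.1),
             ("slow wave, no hot hand", slow_wave, 0.0)]
    for name, base, delta in cases:
        shots = simulate(50_000, base, delta)
        overall = sum(shots) / len(shots)
        print(name, round(overall, 3), round(rate_after_two_hits(shots), 3))
```

With flat ability, the rate after two hits stays near 0.5 when `delta` is 0 and moves to about 0.6 when `delta` is 0.1. With the slow wave and `delta` at 0 it still comes out above 0.5, because streaks cluster in stretches where ability is high.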

• Daniel Lakeland says:

I am dictating to my phone while watching kids practice soccer so apologies for typos or wordos below

the question is ill posed in my opinion. The biggest problem is figuring out what is meant by probability in these questions. we know basketball players are not carefully constructed random number generators. but if you try to pretend that they are you will have some sense of a “true” probability, because random number generators have actual parameters that exist in the world. from this perspective the question is what actually happened to the parameter p which was used to generate the actual sequence of Hit or Miss values that occurred. and did it very through Time? I think this question is literally insane it’s like asking what did the extraterrestrial aliens do with the knobs that control the brains of the players. the aliens don’t exist and neither do the “true” p values.

on the other hand you could ask a question about how much information you have about whether a particular shot will be made or not. from this perspective it’s absolutely the case that probability, meaning your estimation of what is going to happen, changes from shot to shot dramatically. you might observe that a particular shot is taken right after a player is bumped and so you have a reason to believe that he will be off Target. on the other hand a player may take a layup without any pressure at all, and your estimation will be this is very likely to succeed. from this perspective probability is a function of the information used to construct the model. because different people have different sets of information they all have different probability assignments. if you are actually watching the game, information is coming in at all times. you may see a player begin to limp or you may see him smoothly controlling his body with greater than normal skill.

from the perspective of the second question I think the only meaningful hot hand question to ask is if you know only the information about who is shooting and what their fitness was at the time the game started, and their historical performance, does updating the success probability as a function of time offer improvements in predictive accuracy, and if so what kind of dependent likelihood makes sense, and what kind of information should you use?

from the point of view of the second conception, the hot hand absolutely exist in the sense that the teammates who see that “Joe is hot” have updated their probability assigned to a sequence of above normal frequency of successes coming up. whether they are actually correct or not in terms of predictive accuracy is a different question.

we can use the surprisal measure I was talking about in the prediction market Post in this context as well. imagine the teammates see that Joe is hot and update their p value now we can look at the following sequence of say 10 or 15 shots and asked whether that sequence became more surprising or less surprising compared to the original pre hotness model. if they increase their probability of success and the following sequence became less surprising then they were correct. if they increased their probability assignment and the following sequence became more surprising then they were wrong. we can actually ask them Right before every shot to give their updated p we can compare the surprisal under their continuously updated model with the surprisal under a less frequently updated model maybe even a constant for the whole game.

on the other hand the teammates are seeing the actual game, an analyst far away and later in time is seeing a single string of bits hit or miss we can ask the question should the analyst use information about how many recent hits there were to update the analyst estimation of whether the following sequence will contain more or less hits. this is a difficult question to answer because the string of hits or misses is itself a very low quantity of information compared to what the teammates had while watching the game.

even if we prove that the analyst cannot extract enough information from the string of bits to be able to successfully update their model of success or failure in order to reduce surprisal in a consistent way, this doesn’t prove that the teammates doing it with a completely different and richer set of information are wrong
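The surprisal comparison described above can be sketched in a few lines. This assumes "surprisal" means the negative log probability of what actually happened, measured in bits; the shot sequence and the two probabilities (0.5 before, 0.7 after "Joe is hot") are invented for illustration.

```python
import math

def surprisal_bits(outcomes, probs):
    """Total surprisal, in bits, of 0/1 outcomes given per-shot hit probabilities."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        total += -math.log2(p if y == 1 else 1.0 - p)
    return total

# Hypothetical next 10 shots after the teammates decide Joe is hot.
hot_followup = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]    # 8 hits, 2 misses
cold_followup = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]   # 3 hits, 7 misses

for label, seq in [("hot follow-up", hot_followup), ("cold follow-up", cold_followup)]:
    baseline = surprisal_bits(seq, [0.5] * len(seq))  # pre-hotness model
    updated = surprisal_bits(seq, [0.7] * len(seq))   # teammates' raised p
    verdict = "update helped" if updated < baseline else "update hurt"
    print(label, round(baseline, 2), round(updated, 2), verdict)
```

The same function accepts a different probability for every shot, so a model updated right before each shot can be scored against one held constant for the whole game.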

• jim says:

“on the other hand the teammates are seeing the actual game, an analyst far away and later in time is seeing a single string of bits hit or miss we can ask the question should the analyst use information about how many recent hits there were to update the analyst estimation of whether the following sequence will contain more or less hits. this is a difficult question to answer because the string of hits or misses is itself a very low quantity of information compared to what the teammates had while watching the game.”

Badabing! Yes, totally! I was thinking about this a while back also for stock trading. If a person follows a stock or group of stocks closely over a long period (years) and is aware of both fundamentals and news that impacts the individual stock price and the overall market, they can profitably trade on swings of the stock. But just analyzing the ticker tape of a random group of stocks ten years down the road without the full context of the moment gives the mistaken impression of a “random walk”, in which it’s impossible to consistently trade profitably.

Exactly the same as the hot hand problem. Just imagine you were buying, selling and shorting investments on the next hundred shots / pitches / at bats.

If I were a “player performance” investor, I’d be watching for a slide in Verlander shares going into the next world series! :)

2. jim says:

There is an entire class of problems for which some perceived change is hard to detect because the baseline is variable and the perceived change is of modest magnitude and/or duration relative to baseline variation.

With the hot hand, if someone’s shooting percentage jumped by half for a month (say, 60% to 90%) then fell off again, everyone would agree the player had a hot hand for that month. But in reality the hot hand is a modest improvement (say 60% to 70%) over a short time (a game or two or three) relative to a fluctuating baseline, so detecting it is extremely difficult.

Another prob that comes to mind is minimum wage. No one creates 50-100% increases in min wage because everyone pretty much agrees that would put the brakes on job growth. So we get 10-15% every other year or something like that, and the impact on job growth would be hard to detect against a background with variation. Here in Seattle we recently had our gigantic study on the 50% increase in min wage over five years. But even then, it’s taking place against the background of screaming growth, so the impact might be minimized.

• to understand the difference in information content consider that a player might take one shot per minute and whether they hit or miss is one bit of information. this means we are getting about one bit per minute of incoming information if we’re the analyst. on the other hand a decent television broadcast is perhaps 10 million bits per second obviously not all of it is relevant to the question of how the player is doing but if even one part in 1000 is relevant we’re still getting 10000 bits per second, or 600000x the rate for the analyst.

asking whether a person with 600,000 times as much information as you should be updating their probability assignments rapidly it’s not that weird of a question.
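The back-of-envelope arithmetic above checks out. The figures are the ones quoted in the comment; the variable names are just labels for this sketch.

```python
# Analyst: one hit/miss bit per minute.
seconds_per_analyst_bit = 60

# Viewer: a 10 Mbit/s broadcast, with one part in 1000 assumed relevant.
broadcast_bits_per_second = 10_000_000
relevant_one_in = 1000
viewer_bits_per_second = broadcast_bits_per_second // relevant_one_in

# Ratio of viewer rate to analyst rate (analyst rate is 1/60 bit per second).
ratio = viewer_bits_per_second * seconds_per_analyst_bit
print(viewer_bits_per_second, ratio)
```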

• Anonymous says:

dang it this was supposed to be attached to my post above that’s what I get for editing comments on my phone

• Daniel Lakeland says:

this was supposed to be attached as an extra to my comment above not to Jim’s comment but it got confused due to editing on my phone

• jim says:

OK, that makes more sense!

Your phone knows “Target” is a store! :)

3. Dan F. says:

The reason for assuming constant p is de Finetti’s theorem on exchangeability. If shots in basketball are not exchangeable, then this may not be a good assumption. In that case one needs quantitative measures of the failure of de Finetti’s theorem – by how much does it fail if shots are almost exchangeable in some precise sense – and the conclusion may or may not be that assuming p nearly constant is reasonable. There’s not a lot of work in this direction, but there is https://arxiv.org/abs/1906.09507. They claim that exchangeability is “robust” in reasonable examples. I have no idea how this plays out in the basketball example.

• Andrew says:

Dan:

In a Bayesian context, exchangeability of a joint distribution is simply invariance of the prior distribution to permutations of the indexes. If we have data on a sequence of shots, the indexes represent a time ordering and contain information. I agree that variation of the marginal probabilities is only one aspect of lack of exchangeability or time dependence in the distribution.

In practice, the answer to “How close is the joint distribution to exchangeability?” depends on what question is being asked. For a simple example, if you use a player’s free throw data from the first half of the season to predict his shots in the second half, then you’ll make predictable systematic errors.

4. Rahul says:

This hot hand bit seems to me like a classic example of an ill posed problem. They are all hunting for a subliminal effect in a noisy system where things cannot be controlled very well in any case. People should just admit that they cannot prove nor disprove it and stop looking.

Furthermore every time someone reports lack of an effect there’s this moving of goal posts. Somewhat of a no true Scotsman dance. “Well when I speak of a hot hand THIS is what I mean”. It’s becoming like a group of philosophers debating endlessly over a vaguely defined concept.

Can anyone point to even a canonical description of what we mean by the “hot hands fallacy”? And are all the articles etc. we talk about using this same definition?

• Phil says:

You’re not wrong about the lack of a strict, agreed-upon definition of ‘hot hand’, but I don’t think the question is as empty as you imply, either. People have gained some insights into looking for a ‘hot hand’, and the ‘hot hand fallacy fallacy’ is a nice example of how even professional statisticians can miss a statistical problem that has been in plain sight for decades. People are learning by looking into this question, even if ‘this question’ is ill-defined.

Generally speaking, belief in the ‘hot hand’ is a belief in a short-lived increase in the probability that a given athlete will be successful in a specific task (usually making a basket or hitting a baseball, but people have looked at other stuff too). I think you know this, but other readers may not.

The general concept is pretty clear and most people seem to agree that it fits what sports fans mean when they say someone has a ‘hot hand’. But converting the general concept into a specific statistical question is harder. One of the issues is that qualifier: ‘short-lived’. If a player simply improves over the course of a season and ends up shooting 53% after starting at 43%, we don’t say they got hot, we say they got better (or maybe just their shot selection got better). Also, how big an improvement does there have to be in order to say someone ‘got hot’? 10 percentage points? 20? 40?

• > Generally speaking, belief in the ‘hot hand’ is a belief in a short-lived increase in the probability that a given athlete will be successful in a specific task

Let me rewrite that as: “belief in the ‘hot hand’ among some people is a belief that aliens from space have tweaked the knobs that make LeBron so good up a few notches for a short time, whereas belief in the ‘hot hand’ among other people is belief that the people watching the game are justified in increasing their assessment of whether LeBron will make a shot upwards for a short while based on information they have about his recent performance”

People with PhDs tend to favor the first interpretation, and they talk about whether the knobs were “really” tweaked or not (the probability “really” changes over short times)… sports fans who haven’t been properly educated by universities tend to think like the second thing “I think if they give it to LeBron he’s going to slam it in right now because he’s playing really well, but last night he wasn’t doing nearly as well”

Thankfully we have erudite professors to let the sports fans know that the aliens are keeping the knobs in place… and other erudite professors who have discovered the changing knob fallacy fallacy and know that actually the knobs are changing slowly… further erudition will probably result in a more accurate assessment of the frequency with which the knobs are oscillating….

• Phil says:

One of the things I liked about the original paper (or at least the first one I saw: Gilovich, Vallone, and Tversky) is that they didn’t just look at the stats, they also looked into the issue of why people believe in the ‘hot hand’ even though (they thought) the effect does not exist at a meaningful level. In essence: any time people saw a player hit three or four shots in a row, they figured the player was ‘hot’ because a string like that “can’t” be just random. They showed people — college students — strings of successes and failures, one of which was iid and the others having different degrees of positive or negative autocorrelation, and asked them to pick which one was just coin flips. Overwhelmingly, people picked one of the anti-correlated ones; that is, one where a failure is more likely to be followed by a success than by another failure, and vice versa.

There is no question that people see patterns in randomness, and overestimate the amount of ‘non-randomness’ that is needed in order to produce an observed pattern. It seems clear to me that this effect leads casual observers to vastly overestimate the ‘hot hand’ effect. It’s true that statisticians who looked into this issue _underestimated_ the effect for decades, by using a flawed approach to looking at it, but that doesn’t mean the average sports fans are right either.

I’m not really arguing a specific point here, just saying the average fans are really really wrong, and the erudite professors have been rather wrong too.

• I actually agree with you! I think talking about this question brings out serious flaws in people’s understanding of what they are even doing when they talk about probability models. Those with no education tend to “see patterns in randomness” so that even the outputs of say cryptographically strong random number generators will be confused with “information”.

Those with education tend to “think randomness is a physical property of a system to be objectively quantified”

Both can learn from this discussion… In the end, the discussion isn’t really about sports at all… it’s about misconceptions.

• Matt Skaggs says:

Rahul wrote:

“This hot hand bit seems to me like a classic example of an ill posed problem. They are all hunting for a subliminal effect in a noisy system where things cannot be controlled very well in any case. People should just admit that they cannot prove nor disprove it and stop looking.”

I will go even further. Lots of really smart and capable folks have given it their best shot (including my psychology professor brother-in-law), and yet everyone is still arguing back and forth on whether it is a fallacy or a fallacy fallacy. What we have here is a classic example of an ill posed problem, one that could substitute for a more formal definition.

For me, the hot hand happened when all five fingers ended up pointed towards the basket after the shot, meaning my wrist did not twist when I released the ball. Those shots went in with uncanny regularity. Sometimes I could will it to happen after a bad first half, sometimes I could not will it to happen at all. Overall, I was a rather mediocre shooter. I cannot imagine a statistical construction that would have predicted my performance.

One thing I have noticed from baseball Sabermetrics is that the predictions work very well at the team level, but the variation is hopelessly high for individual level performance. The team level is the exact same individual data rolled up. Could this be a useful definition of an ill posed problem?

• I take the view that there is no *physical* probability that is a property of the atoms we call “the player” (or the fictional space aliens in my post above that stand in for the same concept). I don’t see how anyone could disagree with this.

The probability is an informational construct in our brains… how likely do we think it is for someone to do well right now… obviously, this depends on all the information we know about the question, such as say the last 15 minutes of video coverage of the game, and whether the player is our brother in law and is having marital difficulties with our sister, and if the player recently took a two week vacation to Florida to go bone-fishing vs just finished an intensive 2 week practice session with a special shooting coach…

On the other hand, someone else who has only a string of ones and zeros to show whether a player hit or missed their last N shots will unsurprisingly have a different assessment.

If you avoid the totally problematic question of whether “the probability” (a physical property of the atoms we call “the player”) “really does change” you are left with the following possibly dramatically more well-posed questions where it’s relatively trivial to provide existence proofs for the correct answer:

1) Is there any information at all which should make us change our assessment of the probability to hit the next shot, or maybe the next few shots, or maybe the next game-full of shots? This is trivially answerable as “yes”. As an existence proof, for example, what if we know that they’re playing a player with a known significant injury because the only person who can sub has an even worse injury? What if just as the player went up to take a 3 point shot a crazed sniper shot the ball with a pellet gun and it exploded? What if as the player released the ball, strange temperature related expansion stresses caused the backboard to shatter and the hoop to drop to the floor…

2) Is there any information that we can act on based on the coverage during the game or watching the game from the bench, where no outlandish differences are occurring compared to a typical game (no snipers, broken baskets, troublesome injuries etc) that should change our assessment maybe even from shot to shot? This is trivially “yes”. As an existence proof: when the player gets a rebound 2 seconds before the final buzzer and flings it down the whole court to try to make a basket and tie or win the game, it makes sense to give that shot a lot lower probability than a typical freethrow for example. It makes sense to give layups with no defense coverage a lot higher probability than 3 point shots under heavy defense… Therefore it actually makes fine sense to say that “the probability oscillates wildly from 0 to 1 from shot to shot” depending on if the shooting circumstances do in fact change a lot from shot to shot, and you have this information about the shooting circumstances.

3) Is there any information in a sequence of 0,1 values that just tell us whether a shot was hit or missed that should make us change our assessment of the probability of the next shot value? Trivially the answer is yes, but the *magnitude* of the change based on only 1 bit of information is ultimately going to be small… But as an existence proof, suppose that we have the following sequence of hit/miss:

1010111010010000010101110111111111111111111111111111111111111111111111

In the first 10 shots the frequency is 6/10; to say that you should estimate the probability at 0.6 and hold it there to the end is obviously insane. Clearly something happened like maybe the player entered a freethrow contest and is extremely good at freethrows and hit one after another after another…

4) If you assume a particular form for the generative probability model, in which p is a particular function of shot count or time, and observe a sequence of 0,1 values, which sequences of p values are supported by the model and the data, and which are not, and does the supported set include p-sequences that vary considerably throughout the game? You can answer this for yourself in Stan… What you’ll find is that it depends wildly on the form of the generative model, for example if the generative model has as input the position on the court that the player takes the shot from, the time that the player has been playing so far today, the team against which the player is playing, and an assessment of the fitness/health status of the player, it will get a wildly different result than if the model assumes only that p is representable by a Fourier series with 10 specific low frequency terms…

The biggest problem for the “hot hand” is that people *still* keep trying to estimate a “true” probability which is a property of the atoms we call “the player”. We will answer this question directly after we get a final answer for the age old question about angels dancing on heads of pins…
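The existence proof in item 3 can be checked numerically. The sketch below scores two forecasters on the hit/miss string from that item: one that fixes p at the first-10 frequency, and one that re-estimates p from the last 10 shots. The window length and the add-one smoothing are arbitrary choices for this illustration, not anything proposed in the thread.

```python
import math

# Hit/miss string copied from item 3 above.
SEQ = "1010111010010000010101110111111111111111111111111111111111111111111111"
shots = [int(c) for c in SEQ]

def log_loss_bits(outcome, p):
    """Surprisal in bits of one 0/1 outcome under hit probability p."""
    return -math.log2(p if outcome == 1 else 1.0 - p)

def windowed_estimate(history, window=10):
    """Smoothed hit rate over the most recent shots (add-one smoothing)."""
    recent = history[-window:]
    return (sum(recent) + 1) / (len(recent) + 2)

first_ten_rate = sum(shots[:10]) / 10

fixed_total = 0.0
adaptive_total = 0.0
for t in range(10, len(shots)):
    # Forecast shot t using only shots 0..t-1.
    fixed_total += log_loss_bits(shots[t], first_ten_rate)
    adaptive_total += log_loss_bits(shots[t], windowed_estimate(shots[:t]))

print(first_ten_rate, round(fixed_total, 1), round(adaptive_total, 1))
```

The fixed forecaster pays the same price for every hit in the long closing run, while the windowed one pays less and less, so its total surprisal ends up lower even though it does slightly worse during the choppy middle stretch.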

5. Anoneuoid says:

The outcome of a coin flip is deterministic, but we typically lack the required information to predict the outcome. So instead we use a probabilistic model saying the odds of heads is 50%… which is better than nothing. Same thing for assigning a value p for each player. That value reflects the average probability of making a shot, it need not reflect an actual internal state of the player to be useful.

But I would say given that players always warm up and get fatigued, some kind of Poisson binomial model would be better.
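For readers unfamiliar with the term, a Poisson binomial distribution is the distribution of the number of successes in independent trials whose success probabilities differ from trial to trial. A minimal sketch follows, with an invented warm-up-then-fatigue profile of per-shot probabilities; the numbers are assumptions for illustration only.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of hits in independent shots with per-shot probabilities."""
    pmf = [1.0]  # before any shot: zero hits with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # this shot misses
            nxt[k + 1] += mass * p     # this shot hits
        pmf = nxt
    return pmf

# Hypothetical profile: warming up, peaking, then tiring. The average is 0.5.
profile = [0.4, 0.5, 0.6, 0.6, 0.6, 0.5, 0.4, 0.4]

pmf = poisson_binomial_pmf(profile)
mean = sum(k * m for k, m in enumerate(pmf))
variance = sum((k - mean) ** 2 * m for k, m in enumerate(pmf))
binomial_variance = len(profile) * 0.5 * 0.5  # constant p = 0.5, same mean

print(round(mean, 3), round(variance, 3), binomial_variance)
```

The mean matches a constant-p binomial with the same average, but the variance is a little smaller, which is one reason totals alone say little about whether p moved around during the game.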

6. If you want to understand how probability changes with information, consider what your brain thinks during this video of a pitch-in in 1982 by Tom Watson at Pebble Beach.

at 0:24 you think “hmmm he’s in the rough, it’s a hard shot, he’ll probably get close but it’s not likely to go in… p=0.05 maybe”

at 0:30 you’re thinking “wow, that’s a very nice shot, p=0.5”

at 0:32 you’re thinking “it’s gonna go in! p = 0.9”

at 0:33 you know “it went in! p=1.0”

probability varied wildly from near 0 to 1.0 over 9 seconds

• Anonymous says:

Or this one from Tiger Woods….

At first it’s p=0.001 or so… by 1:30 you’re saying “wow, that might go in… p = 0.7”

at 1:35 you’re saying “that’s definitely going in p=0.95”

then at 1:37 it looks like it’s going to stop just a hair short… p=0.2
then at 1:39 it does stop, and it’s a hair short… p=0.05
then as the weight crushes down the grass on the edge of the hole, it starts to move again and it falls in the hole. p=1.0

• Dzhaughn says:

I knew it would go in from the moment I started watching the video.

• Phil says:

Sure, but I’m not sure why you think this seems noteworthy. When an excellent free throw shooter goes to the line to shoot two free throws with his team down by 1 with 0.2s left on the clock, his team has a better than 85% chance of winning. If he misses the first free throw, now it’s more like a 45% chance. If he misses the second one too, now it’s zero. This sort of thing happens all the time.
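Those numbers are consistent with a simple calculation. The sketch below assumes a 90% free throw shooter and a 50/50 overtime, neither of which is stated in the comment, and ignores everything else that could happen in the last 0.2 seconds.

```python
def win_prob_down_one(p_ft, shots_left, points_made=0, p_overtime=0.5):
    """Chance of winning when trailing by 1 with only free throws left.

    p_ft:       assumed probability of making each free throw
    p_overtime: assumed chance of winning if the game goes to overtime
    """
    if shots_left == 0:
        if points_made >= 2:
            return 1.0          # ahead by 1: win
        if points_made == 1:
            return p_overtime   # tied: overtime
        return 0.0              # still behind: loss
    make = win_prob_down_one(p_ft, shots_left - 1, points_made + 1, p_overtime)
    miss = win_prob_down_one(p_ft, shots_left - 1, points_made, p_overtime)
    return p_ft * make + (1 - p_ft) * miss

before_first = win_prob_down_one(0.9, shots_left=2)
after_one_miss = win_prob_down_one(0.9, shots_left=1)
after_two_misses = win_prob_down_one(0.9, shots_left=0)
print(round(before_first, 3), round(after_one_miss, 3), after_two_misses)
```

Under these assumptions the three values come out at about 0.90, 0.45, and 0.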

• The point is that if you pretend you were watching the game live and didn’t know anything else, you can feel your assessment of the event “the ball will go in the hole” change as you watch the physical information about the trajectory of the ball come in… it’s very intuitive and can help people who are still confused about what role probability is playing in a sports scenario figure out the right answer to that question. Why is the probability changing? Because you see new information as the game unfolds… Why should the probability change rapidly on a shot-by-shot basis? Because you see the physical configuration of the player and ball moments before it’s released, and sometimes it “looks like it’s on target” and sometimes it “looks like he was fouled and is going to fumble the ball” etc.

As soon as you abstract the real world all away and start talking about “you see 01011101111 what do you think the probability of seeing the next shot go in is, and is it higher than it was after you saw only 01011?” it’s much less intuitive and so you can get snookered down the hole of some Prize Winning Professor telling you “if the results were binomial… blablabla” and you can start to think of the p as a property of the system as opposed to the property of your state of understanding.

The reality has no binomial p. Almost ALL of the controversy over the hot hand is an insistence on there being a “right” value of p which can change only at best very slowly. In reality this is a property of *the data analyst’s estimate when all the data analyst has is the historical hit/miss record*

All the content in the original “hot hand fallacy” was essentially “a bit stream is an information poor source, and so rules like ‘if you see 3 hits in a row, you should think the player is hot and immediately increase your estimate of p to a much higher one’ produce invalid inferences from the available data with the available model”

That’s true while still not proving *anything* about what a teammate sitting on the bench watching the game should think.

• Compare watching the ball roll towards the hole to having the same bitstream of video encrypted using a state of the art algorithm like AES128 with a high quality secret key that you don’t know, and then blitted onto the screen as if it were pixels.

Because of the construction of the encryption algorithm, no amount of watching this bitstream and doing supercomputing heroics on it will ever give you more information than you had before you received the bitstream. If you don’t know anything about “Did Tom Watson pitch in spectacularly at Pebble Beach in 1982 to win against Jack Nicklaus?” you will still have zero knowledge after receiving the bitstream and doing supercomputing for even a couple of weeks…

On the other hand, if I know nothing about the question before receiving the bitstream, EXCEPT the secret key… I know everything about the question after receiving the bitstream, decoding it, and watching the video for 2 minutes…

by acquiring just 128 bits of information about the key, I can now answer with certainty questions like “did he pitch in to win?” and “was he wearing a white glove?” and “was it bright and sunny or cloudy and diffuse lighting?” and “was he wearing a red jacket, a brown shirt, or a navy blue sweater?”

my probability assessment of all sorts of questions changes dramatically when I receive that 128 bit key, and receiving that 128 bit key in 2019 has *zero* causal connection to the actual event back in 1982, so probability *can not* be a physical fact *about the event*.

• Matt says:

Daniel,

It seems like you are making this way too complicated. Yes, the *probability* of some event occurring changes as information flows in — that seems pretty uncontroversial.

I guess my interpretation of the hot hand question would just be: conditional on the usual information set (historical performance, maybe a few other game-specific factors, etc.) is a player more likely to make a shot following a string of makes versus a string of misses? All the bits of information you cite, for example that their teammates will have, is not too relevant. These should cancel out, and behave like random errors, unless we are really missing something systematic. (As a side note, I am sympathetic to Andrew’s original point — if we don’t allow for the hot-hand to exist, then it seems like it *must* be the case that players’ abilities are then fixed over time. Long-term changes in ability are just functions of changes on a smaller scale. But, still I think there is a reasonable null model to be made that has time-varying ability and no “hot-hand” effect defined in a slightly different way than just “changes in ability over time on a short time-scale”.)

Also, I do think you can still think of a player having a *true* shooting probability p at any point in time. One interpretation of it would simply be, conditional on the information set that is under consideration, we expect this player to make p % of shots in the long-run (that is, any player who is the same along all of the observable dimensions at our disposal. It won’t be possible to actually find these players to get the long-run frequency, but that’s hardly relevant. That’s the point of the statistical model). But, this means that my true probability could be different than yours, if we process the information differently! Okay, then let’s define true probability as the probability that a perfect information-processing machine would come up with, given the relevant information set. Alternatively, maybe you interpret the true probability of making the shot as the fraction of “alternative worlds” out of all possible worlds that are consistent with everything that has happened up to the event under consideration, AND in which the player makes the shot. Both of these seem reasonable, and indeed, I would argue both are physical facts about the shooter. Or maybe you can’t argue that.. seems like semantics either way. And of course the probability of the player making any specific shot is 0 or 1 conditional on knowing all there is to know about the world — but we don’t, and that doesn’t seem too problematic in practice.

spent a bit of time trying to hash this out at one point.. result was pretty so-so: http://mattcourchene.com/defining-randomness/

• > that seems pretty uncontroversial

and yet, it’s at the core of almost all the controversy about the hot hand…

>I do think you can still think of a player having a *true* shooting probability p at any point in time.

So apparently my work here is not done. In what sense is it *true* at time t=0 that player A, who is about to take a shot at time t=1, will make that shot with probability 0.85?

You could say “it’s true that he will make the shot” and find out later whether that’s an accurate statement, and you could say “it’s true that he will not make the shot” and later find out…. but in what sense could you find out that “it’s true that there’s 85% probability he will make the shot”

The only thing I think you can say is: given my model and all the information I have, a correct calculation of the posterior predictive distribution of my model puts 85% of the probability on making the shot… this is the “perfect information processing machine” interpretation… it’s true in the sense of not accidentally making a calculation error… but it’s not unique, it’s not a true fact about the player. If it were, there would be a unique answer, instead it’s a true fact about your model and information set.

It’s “true” in the sense that you didn’t make a calculation error, and you didn’t input known to be false information into the model, that’s it. This isn’t what the Hot Hand Fallacy means, that people state a formal model, run the calculations, but they accidentally typed some typos in the code… at best maybe Tversky was arguing that people’s shortcut calculations fail to accurately reflect any single consistent formal model… but I hardly think that’s so surprising. How many sports fans have ever calculated a Bayesian posterior with Stan?

> indeed, I would argue both are physical facts about the shooter

I’ve already shown how if I have an encrypted video, stored for 40 years on my computer, by transmitting a 128 bit encryption key to me 40 years later, my probability about Tom Watson’s pitch-in shoots from some kind of “I don’t know” of maybe 10% up to exactly 1. It can’t be a fact about the event or about Tom Watson, because it all happened ages ago.

> seems like semantics either way

in other words “the study of meaning”. Since most people don’t seem to know what the meaning of the word “probability” is in the context of shooting baskets, we have endless arguments, because people think it means something totally different.

I’m providing the semantic argument to show how this question *can not be answered* because it’s normally taken to be a question about a thing that *does not exist* namely, the “true, physical change in probability” of making a shot after having just made 3 or 4 previous shots…

At best, you can say that “whatever method people use to assess their probabilities does / does not contain useful information”. You could check this over a long string of N shots by asking a person to rate the probability of each shot right before it happened, and then seeing whether the overall surprisal was larger or smaller than N bits, since you can transmit the string of shots perfectly in N bits.

If their assessment resulted in less than N bits of surprisal, then you can say that they had some information.
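A minimal sketch of that check in Python. The function name `total_surprisal_bits`, the shot string, and both forecasters are made up for illustration; the only thing taken from the comment is the comparison against N bits.

```python
import math

def total_surprisal_bits(probs, outcomes):
    """Sum of -log2(probability the forecaster gave to what actually happened)."""
    total = 0.0
    for p, hit in zip(probs, outcomes):
        q = p if hit else 1.0 - p      # probability assigned to the observed result
        if q <= 0.0:
            return math.inf            # certain and wrong: infinite surprisal
        total += -math.log2(q)
    return total

# Hypothetical string of N = 8 shots (1 = make, 0 = miss).
outcomes = [1, 1, 0, 1, 0, 0, 1, 1]
n = len(outcomes)

# Forecaster with no information: 50/50 on every shot, so exactly 1 bit per shot.
coin = [0.5] * n

# Made-up forecaster who leans the right way on every shot.
informed = [0.8 if hit else 0.2 for hit in outcomes]

print(total_surprisal_bits(coin, outcomes))      # the N-bit baseline
print(total_surprisal_bits(informed, outcomes))  # below N bits: some information
```

A forecaster who leans the wrong way often enough ends up above N bits, which is the sense in which a confident but uninformed assessment is worse than saying 50/50 every time.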

• Matt says:

It can be a fact about Tom Watson in 1983 with a certain amount of information available to the person whose probability assessment we are considering.

• It’s a fact about the knowledge of Tom Watson held by the person we are assessing. It’s the difference between watching a person fly a jet airplane on TV and flying in a jet airplane.

• Matt says:

Again, I don’t see how any of this is relevant to estimating a hot-hand. It’s a well defined question. It asks whether a shooter, conditional on a bunch of things (historical performance, difficulty of shot at hand, etc.) is more likely to make a shot after previously making one (or two, etc.) versus missing one. The fact that there is MUCH more information out there is not relevant, unless you think it correlates with making/missing the previous shot (i.e. some sort of omitted variables bias). Otherwise, this seems to be a well-defined statistical problem. While it’s certainly interesting to get into the weeds regarding the philosophy of probability, it doesn’t make any practical difference. In simpler terms, for the hot-hand question, as with most (all?) questions of causal inference, we don’t care about R-squared, we care about parameter estimates. Our model can miss a lot but still identify a hot-hand effect.

• Matt, even your confident characterization of “what the hot hand is” is way overconfident. Lots of people wouldn’t characterize that as the essential issue in the hot hand.

From Andrew’s post here ” If so, I could imagine that much of the hot hand has to do with temporarily not being seriously injured, or with successfully working around whatever injuries you have.”

Is it magically the case that suddenly your lack of injury matters only after hitting 3 shots in a row? No. The situation Andrew is thinking about is a short term improvement in performance relative to the long term average, specifically because the player is more comfortable/healthy, not because he just hit 3 shots in a row.

• Matt says:

Yes, certainly agree, there are lots of details that matter — I gave a simplified version of the problem.

I was responding to your earlier comments that, because ultimately probabilities change as information is accumulated (and eventually converge to 1 or 0 as in the golf videos), this means the hot-hand question can’t be well-defined because there is no *true* make probability to begin with. But to me that just seems misguided, and it also seems like it could be applied to all problems of statistical inference.

Regarding injury, this will be mostly controlled for. Like I said we have historical performance which will have been affected by any recent injury. We can also estimate some varying measure of ability during the game, too. I’m not saying it would be easy — I agree that separating the notion of a hot-hand from just time-varying ability is difficult, I said as much in my first comment — I just don’t agree with the way you’ve framed the issue. The fact that probability varies with our information set is not relevant to answering the hot-hand question.

• > The fact that probability varies with our information set is not relevant to answering the hot-hand question.

I’m blown away by this statement. It’s as if you said that “the fact that calorie intake varies with the amount and quantity of food you eat is not relevant to the question of whether gorging on donuts every Sunday makes you fat”

What is the hot hand about if it’s not about changing assessments of shot probability? That’s 100% what it’s about. Athletes and committed sports fans see players are playing “hot” and they think they can detect it using the information available to them.

https://www.scientificamerican.com/article/momentum-isnt-magic-vindicating-the-hot-hand-with-the-mathematics-of-streaks/

The part about streakiness is just Gilovich, Vallone, and Tversky’s way of formalizing the question so they could investigate it with the standard statistical calculations available to them back in 1985. They didn’t know anything about Fourier analysis, signal sampling theory, or Bayesian statistics, and they couldn’t even easily write simulation software to validate their own metrics, so they created a biased, incorrect metric, as shown in the above Scientific American article.
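One way to see how a streak-based measure can be biased is to enumerate every possible sequence of fair coin flips and average the within-sequence hit rate on flips that follow k hits in a row. This is a sketch that assumes the bias in question is the streak-selection effect the linked article describes; the function name is hypothetical and nothing here reproduces the 1985 analysis itself.

```python
from fractions import Fraction
from itertools import product

def expected_prop_after_streak(n, k):
    """Average, over all 2**n equally likely hit/miss sequences, of the
    within-sequence proportion of hits on shots that follow k hits in a row.
    Sequences with no such shots are dropped, since the proportion is undefined."""
    props = []
    for seq in product((0, 1), repeat=n):
        followers = [seq[i] for i in range(k, n) if all(seq[i - k:i])]
        if followers:
            props.append(Fraction(sum(followers), len(followers)))
    return sum(props, Fraction(0)) / len(props)

# Three fair flips, looking at flips that follow a single hit.
print(expected_prop_after_streak(3, 1))           # 5/12, not 1/2

# Ten fair flips, looking at flips that follow three hits in a row.
print(float(expected_prop_after_streak(10, 3)))   # also below 0.5
```

Every flip is an independent 50/50 event, yet the averaged proportion comes out below one half, so a shooter with no streakiness at all looks slightly "cold" after streaks under this kind of measure.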

The actual *sports* question is: do people get “momentum” where they start to hit more difficult shots more easily than in the past, where they confidently take shots that they would have passed to someone else in the past, where they get into just the right position, with their weight balanced perfectly, etc etc., so that we see them being very accurate for an extended period, like a half, or a whole game, or maybe a couple games or a couple weeks.

A *consequence* of this phenomenon is that they’d hit longer streaks, because their shot accuracy goes up and so streaks become “More likely”. But that’s just *one* consequence. Another consequence is that for example if you chunk their shots into groups of 10 and then graph the frequency of hits within the groups of 10 through time, you’d maybe see periods of elevated frequency… but it’d be horribly noisy.

Similarly grouping into groups of 20…

However, as you get the groups longer, like 200, the chance that their period of improved performance doesn’t last the full length of the bin goes up… but the noise in the estimate of the average goes down…

As you make the bin infinitely large, you are able to estimate the average precisely, but it doesn’t have any location in time that it corresponds to, it’s just the “career average” which mixes data from their rookie year and their retirement year, and the year they played with bursitis, and whatever…
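A rough sketch of this tradeoff, with no simulation needed. All the numbers are made up: a shooter at 45% who is at 60% for one 40-shot stretch out of 1000 shots. For each bin size it reports how much of the hot stretch survives in the best bin's expected frequency, next to the binomial noise of a single bin.

```python
import math

# Made-up illustration values, not data.
BASE, HOT = 0.45, 0.60
N_SHOTS, HOT_START, HOT_LEN = 1000, 400, 40

# True (unobservable) make probability for each shot.
p = [HOT if HOT_START <= i < HOT_START + HOT_LEN else BASE for i in range(N_SHOTS)]

def bin_summary(p, size):
    """Return (signal, noise) for non-overlapping bins of the given size.
    signal: largest expected bin frequency minus the baseline.
    noise:  binomial standard error of one bin's frequency at the baseline."""
    means = [sum(p[i:i + size]) / size for i in range(0, len(p) - size + 1, size)]
    signal = max(means) - BASE
    noise = math.sqrt(BASE * (1 - BASE) / size)
    return signal, noise

for size in (10, 20, 200, 1000):
    signal, noise = bin_summary(p, size)
    print(size, round(signal, 3), round(noise, 3))
```

With these numbers the elevation is 0.15 in bins of 10 but the noise is about the same size, and by bins of 200 the noise has dropped to about 0.035 while the elevation has been averaged down to about 0.03. At no bin size does the hot stretch stand clear of the noise.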

This is actually the well-known quantum mechanical uncertainty principle (you can’t measure position and momentum at the same time), which is not quantum mechanical at all, but rather a property of the Fourier transform: you can’t simultaneously measure frequency and location-in-time.

The windowed Fourier transform allows you to get location-in-time, but the windowing process itself causes a convolution in the frequency domain, so the estimate of, say, the 5-minute period gets mixed up with the estimate of the constant overall average and every other period as well… so you get a lousy estimate of anything with a small window…

Do athletes have momentum where they feel good and act smoothly and effectively for extended periods from 15 minutes to a week or two? Sure, it’s obvious to anyone who’s ever played anything.

Can you detect it using the methods of Gilovich, Vallone, and Tversky using hit/miss data and a biased measure? No, and that’s expected, because they have a shitty noisy measure of the thing, and they had a computational bias in their analysis.

Does the inability to detect the thing with a shitty biased measure mean the thing is confirmed once and for all not to exist? No. That’s the “no statistically significant difference = there is no difference” fallacy that was railed against in that recent Nature paper on retiring statistical significance….

If you unbias the measure and then look at the data, does the unbiased analysis confirm the existence of periods of increased accuracy? Yes, but it’s still a poor measure.

If you take the data and do a Stan analysis do you discover evidence for lots of time-variation in the underlying accuracy? Probably… I haven’t done the analysis. I probably should.

If you add in information beyond just the string of 1,0 hit/miss to the analysis such as biomechanical data measured from video analysis, would you find that your assessment of the shot accuracy changed from minute to minute? Absolutely…

Can you resolve high frequency variation in the underlying accuracy using hit/miss data? Absolutely not, again because of the well known properties of Fourier analysis: Nyquist’s theorem says you can at best resolve frequencies up to 1/2 the sampling frequency… and although shots aren’t taken at even intervals, basically you need people to take shots at an extremely high rate to resolve information about short-term fluctuations in accuracy.

So if you want to see fluctuations that oscillate over say 15 minute periods and resolve them to an accuracy of say 5%, you’ll need them to take maybe 500 shots in that 15 minute period. People don’t do that.
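A back-of-envelope version of that shot count, as a sketch. It assumes the accuracy is constant within the window and uses the usual normal approximation to the binomial; the function name and the choice of a 95% interval are assumptions, not anything stated above.

```python
import math

def shots_needed(half_width, p=0.5, z=1.96):
    """Shots needed so that a z-sigma interval on a make proportion
    has the given half-width (normal approximation to the binomial)."""
    return math.ceil(z * z * p * (1 - p) / half_width ** 2)

# Plus or minus 5 percentage points at roughly 95% confidence.
print(shots_needed(0.05))

# Same target but only one standard error wide.
print(shots_needed(0.05, z=1.0))
```

The 95% version lands at a few hundred shots for a single window, the same order of magnitude as the "maybe 500" figure, and that is before asking for more than one estimate per oscillation.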

In other words, GVT rediscovered the well known fact that poor measurements can’t resolve high frequency fourier components, plus they did it with some lousy biased measure, and then interpreted “no evidence of a difference” as “evidence of no difference”

It’s like everything that’s wrong with typical statistical analysis rolled into a nice package.

• Matt says:

Okay, I guess we aren’t getting anywhere. I’m not sure if you are being deliberately obtuse or not, but anyways not worth continuing the discussion, I guess.

• It’s funny, I had the same thought… but I really do think that you and I are both sincere, it’s just we come from dramatically different planets ;-)

7. asdf says:

You might look into how the Elo chess rating system works if you’re not familiar with it. Each player has a “rating” which corresponds to a slowly varying probability p. For example, a beginner who has played a few games might be rated 1000, 2000 is a strong amateur, 2600 is a mid-level grandmaster, the current world champion’s rating is pushing towards 2900, and the strongest computers are probably in the 3200 range.

Now when two players sit down to a game (say one of them is rated 2500 and the other is rated 2520), you can model the game as each player drawing a sample from a normal distribution centered at their rating, with standard deviation about 50 points. Then whoever gets the higher sample “wins”. Actually there is about a 20 point advantage to having the white pieces (moving first), and the samples have to differ by a significant amount or else the result is a draw. The “significant amount” is to some extent up to the players, through choices of opening systems and playing styles.
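A minimal sketch of the model as described in this comment, which is not necessarily the official Elo formula. The 50-point standard deviation and 20-point edge for white come from the text; the size of the draw margin is not given, so `draw_margin` is a hypothetical parameter, as are the function names.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def game_probs(r_white, r_black, sd=50.0, white_edge=20.0, draw_margin=0.0):
    """Return (P(white wins), P(draw), P(black wins)).
    Each player draws a performance from Normal(rating, sd); white gets a
    fixed bonus; the game is drawn if the two samples differ by less than
    draw_margin (an assumed value, not taken from the comment)."""
    mean = r_white + white_edge - r_black
    spread = sd * math.sqrt(2.0)   # SD of the difference of two independent draws
    p_white = 1.0 - phi((draw_margin - mean) / spread)
    p_black = phi((-draw_margin - mean) / spread)
    return p_white, 1.0 - p_white - p_black, p_black

# The 2520 player has white against the 2500 player; no draws allowed.
print(game_probs(2520, 2500, white_edge=0.0))

# Same pairing with the white bonus and an illustrative 30-point draw margin.
print(game_probs(2520, 2500, draw_margin=30.0))
```

With no white bonus and no draws, the 20-point favorite wins only about 61% of the time, which shows how much a single game is dominated by the per-game spread rather than by the rating gap.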

Basketball could also be like that. The person has a slowly varying rating r, and then when they take a shot, it’s like picking a sample from a normal distribution centered around r and whose other parameters depend on the type of shot etc.

• Andrew says:

Asdf:

Regarding chess ratings, see here. More generally, you’re talking about item response models, and they are used in sports when matching up offense and defense.