It isn’t about whether a model is true at all. You want to compare different models (explanations) to all the data and see which explains it best, as is done with Bayes’ rule.

Then you can hopefully deduce other useful predictions from the model. This process does assume your premise is true, but that doesn’t mean you must believe it is true.
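As a rough illustration of what comparing explanations with Bayes’ rule can look like, here is a minimal Python sketch. Everything specific in it is an assumption made up for the example: the two candidate models ("chance" and "weak_effect"), their hit rates, the equal priors, and the data (60 hits in 200 trials of a four-choice guessing task). The point is only that each model is scored by how well it predicts the data, without either one being declared "true".

```python
from math import comb


def binomial_likelihood(k, n, p):
    """Probability of k hits in n independent trials with hit rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_over_models(k, n, models, priors):
    """Bayes' rule over a finite set of candidate models.

    models maps a model name to its assumed hit rate;
    priors maps a model name to its prior probability.
    """
    weighted = {
        name: priors[name] * binomial_likelihood(k, n, p)
        for name, p in models.items()
    }
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


# Hypothetical models and data, chosen only for illustration.
models = {"chance": 0.25, "weak_effect": 0.30}
priors = {"chance": 0.5, "weak_effect": 0.5}
posterior = posterior_over_models(k=60, n=200, models=models, priors=priors)

for name, prob in posterior.items():
    print(f"{name}: {prob:.3f}")
```

The posterior only ranks the models you put in; if neither is any good, one of them still comes out ahead.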

Also, I don’t get all the ESP hate. It made more sense when we weren’t surrounded by wifi, bluetooth, etc. transmitting information invisibly over a distance without requiring much energy. I’m interested in how or why we see so little evidence of “organic” ESP; it must be a pretty serious vulnerability.

]]>“And you might well be interested in whether the data, even in a subjective probability model, is or is not consistent with the notion that a particular parameter is zero, which we might term the null hypothesis.”

Sure, I agree, actually, with that whole posting. But being consistent with a model is essentially different from the model being true. My “mission” here is that I object to the apparent urge of many people to assign “truth” to models. Granted, these models are surely useful in helping us to think about reality and make decisions, but they are still thought constructs and cannot be identified with how reality works. I just think that this obsession with truth tempts us to over-interpret and to claim more than we can actually achieve.

Sad as I am to leave the Crab Nebula behind, I think you’ve cleared up the difference between us. “But then testing it is pointless, because the underlying process that sends you the data is not what the model is about.” But the model is only defined up to a family of probability processes. You still have to estimate the parameters. And you might well be interested in whether the data, even in a subjective probability model, is or is not consistent with the notion that a particular parameter is zero, which we might term the null hypothesis. You might similarly be interested in the question “who wrote Federalist #20” and write a multinomial probability model that encodes your epistemic uncertainty. Underlying that multinomial model is a word frequency model that maps to texts of known authorship. In principle, this model is aleatory if only you could find infinite samples of each author’s work outside of the Papers. You can’t, but few models are really infinitely extendible.

]]>“My point is simply that if ESP does not in fact exist, then even if it can’t be proven not to exist in a statistical model, the fact that it doesn’t exist can be true.” Fair enough, but that’s not a statistical null hypothesis, which was what the discussion was about. A statistical null hypothesis is a formally specified probability model.

“The decimal expansion of pi may be deterministic, but who’s to say that *everything* isn’t deterministic?” Fair enough again; it was actually my point that probability models are never true in reality.

“We can still make a probability model and that model can have a null (#8 by Hamilton, #23 by Madison, etc, etc.) and that null might be true. The process was deterministic and yet we still have a statistical null because we don’t know the process.” I think this is a fundamental source of misunderstanding. There are different concepts of probability around, roughly divided into “aleatory” (referring to data generating processes) and “epistemic” (referring to the uncertainty of a person or of humankind as a whole). You seem to be mixing them up. Normally statistical hypothesis testing is done in a frequentist (aleatory) setup. This means that the models model the data generating process, *not* subjective uncertainty. In this case, whether or not you know the exact generating process has no bearing on whether the model is true. However, if you take probability models as modelling subjective (or objective, based on a given set of information) uncertainty, the model models your uncertainty and you *know* that it’s true if it matches your state of uncertainty. But then testing it is pointless, because the underlying process that sends you the data is not what the model is about.

Sorry, I won’t play the away game on the Crab Nebula.

]]>This is a fun discussion, albeit off the original topic, but let’s just say we disagree. On the ESP example, I agree there would be auxiliary things like exchangeability necessary to make a full statistical model, but my point is simply that if ESP does not in fact exist, then even if it can’t be proven not to exist in a statistical model, the fact that it doesn’t exist can be true. Truth and statistical evidence are not coterminous.

The decimal expansion of pi may be deterministic, but who’s to say that *everything* isn’t deterministic? We just don’t know the model that determines them. Just because it’s deterministic doesn’t mean it doesn’t lend itself to a statistical demonstration (if not a statistical proof, since the expansion is deterministic but infinite). Who wrote the Federalist Papers is deterministic, but since we don’t know who they were we carry out statistical tests with error bounds. We can still make a probability model and that model can have a null (#8 by Hamilton, #23 by Madison, etc, etc.) and that null might be true. The process was deterministic and yet we still have a statistical null because we don’t know the process.

Finally, you don’t have to know much about the Crab Nebula to make the general point that, in our current understanding of causality, the unknowable true future cannot affect today’s events. And even if there’s some bizarre quantum spooky action at a distance that means it could actually have an effect, the effect on my vacuum cleaner purchase will be nil to as many decimal places as you care to write down.

]]>“Econometrics can be defined as the study in which the tools of economic theory, statistical inference and mathematics are systematically applied, using observed data, to the analysis of economic laws. It is therefore concerned with the ‘empirical determination of economic laws’ … If the observed data are found to be incompatible with the predictions of the theory, it is rejected.” Brown (1991).

Hendry has said that: “Theory consistent – our model should make sense.” I just don’t know that forming a new theory after the fact and not being totally honest about the timeline is good practice in this discipline.

Does that reframe the discussion somewhat?

]]>“If there is no ESP, then nulls which require the absence of ESP are true” – no, because a null as a fully specified probability model will always require more than just ESP not existing, for example independence of observations. (Of course you can specify models that have other requirements than independence, but “ESP doesn’t exist” alone is not a model, and when you make it a model, you will have to add requirements that will not be literally true in reality, as reality is not a data generating process ruled by formal probability models.)

“I hypothesize that one-tenth of the digits following ‘3’ in the decimal expansion of pi are ‘4’” This is a deterministic process; no probability model will be true except stating that with probability one pi will be equal to pi.

I don’t know much about future Crab Nebula radio waves.

]]>I hate the inability to edit… “…without verifying it statistically.”

]]>Here’s another example: I hypothesize that one-tenth of the digits following ‘3’ in the decimal expansion of pi are ‘4’. As you point out, there is no *statistical* test to verify this null. There might be, eventually, a *mathematical* proof of this null, however, which could prove the frequentist probability model is true without verifying it statistically.
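For what it’s worth, the finite version of this claim is easy to check by brute force. The sketch below computes decimal digits of pi with Machin’s formula in integer arithmetic and counts the ‘4’s among the first 1000. The helper names, the choice of 1000 digits, and the 10 guard digits are assumptions made for the illustration. Any such count is a deterministic fact about a finite prefix, not a statistical test and certainly not a proof about the infinite expansion; as far as is currently known, whether each digit has limiting frequency one-tenth is an open question.

```python
def arctan_inverse(x, scale):
    """Return floor-approximated arctan(1/x) * scale via the Taylor series,
    using only integer arithmetic."""
    term = scale // x
    total = term
    x_squared = x * x
    n = 3
    sign = -1
    while term:
        term //= x_squared          # scale / x**n for the current odd n
        total += sign * (term // n)
        sign = -sign
        n += 2
    return total


def pi_decimal_digits(count):
    """Return the first `count` digits of pi after the leading '3'."""
    guard = 10  # extra digits to absorb truncation error (an assumption)
    scale = 10 ** (count + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inverse(5, scale) - 4 * arctan_inverse(239, scale)
    digits = str(pi_scaled // 10**guard)  # '3' followed by `count` decimals
    return digits[1:]


decimals = pi_decimal_digits(1000)
fraction_of_fours = decimals.count("4") / len(decimals)
print(f"fraction of '4' in first {len(decimals)} decimals: {fraction_of_fours:.3f}")
```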

]]>What I mean is that while I agree that there is no way to *verify* it is true, it might nonetheless be true. And, as my example of the future Crab Nebula radio waves demonstrates, not everything *necessarily* affects everything else, e.g. the future does not affect the past. Similarly, if there is no ESP, then nulls which require the absence of ESP are true, whether or not we can prove it.

]]>Re true nulls: What exactly do you mean by this? A null hypothesis is a frequentist probability model, and there is no way to verify it’s true without infinite repetition, which doesn’t exist. Further everything in the world depends in some way on everything else, so no i.i.d. model will ever be true. (That’s not even all that can be said but I leave it at that.)

]]>It’s a new one for me! Thanks for the laugh.

]]>Fair enough, but (a) the null is still not everything incompatible with your expectation; (b) some nulls are true, it’s just that there are few interesting ones that are true. The null that radio waves from the Crab Nebula which will strike Earth 10,000 years from now have no influence on my decision as to which vacuum cleaner I will buy is true. (More interestingly, there may be nulls in quantum physics, for example, that are literally true; to some, like me, carefully constructed nulls reflecting an absence of ESP may well be true as well.) And (c) while reasonable nulls are rarely true, they hold, for better or worse, a provisionally superior ontological status.

]]>It’s like the old joke about the dog chasing the car–it’s fine to chase the noise, you just don’t want to catch it. :)

]]>Oops, meant this to be a reply to Jonathan!

]]>This is why I call it a research “agenda”: the objective is to find out how events are related, not whether they are related in a particular way. It’s fundamentally bottom-up model-building, because you’re gradually narrowing down the values of parameters until they are consistent with only a narrow set of models. The approach is still compatible with journals’ bias toward NHST because any one study produces a p-value that may support the statistical experimental hypothesis, and if p > .05, you have something to say about the implications without it being post hoc. (All this assumes you are doing basic research and not testing a one-off intervention that’s too specific to generalize from its failure.)

]]>Of course, once I stood contradicted by the data, I noticed another aspect of the issue, which in turn made me come up with a new, convincing explanation which, once tested, turned out to be consistent with the results.

Because I care deeply about good science and academic integrity, I wonder how I can write my paper without breaking the conventions of academic writing while avoiding the pretense that I held my final hypothesis to be true from the very beginning.

You did this: https://en.wikipedia.org/wiki/Abductive_reasoning

Then to test your theory you need to work out the consequences of it being correct and deduce predictions from it. Then compare those predictions (along with ones deduced from competing theories) to new data.

]]>No model is ever true, and the problem addressed by a test is whether the data are compatible with the H0, not whether it’s true.

]]>Michael:

Yes, it’s fine to chase noise, but then you should chase noise in all directions, i.e., do a multiverse analysis. The mistake is when a researcher picks out one particular piece of noise and grabs on to it, while ignoring or downplaying all the other correlations in the data.
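To make the "noise in all directions" point concrete, here is a small simulation sketch in Python. All of it is hypothetical: 50 subjects, 40 predictors, and an outcome that are independent Gaussian noise by construction, with the seed, sample sizes, and function names chosen only for illustration. The p-values use the Fisher z approximation for a correlation, which is an approximation rather than an exact test.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def approx_two_sided_p(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z transform)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_subjects, n_predictors = 50, 40

# Outcome and predictors are all independent noise: every null is true here.
outcome = [rng.gauss(0, 1) for _ in range(n_subjects)]
p_values = []
for _ in range(n_predictors):
    predictor = [rng.gauss(0, 1) for _ in range(n_subjects)]
    r = pearson_r(predictor, outcome)
    p_values.append(approx_two_sided_p(r, n_subjects))

n_significant = sum(p < 0.05 for p in p_values)
print(f"{n_significant} of {n_predictors} pure-noise predictors reach p < .05")
```

With 40 true nulls you would expect about two "findings" at the .05 level on average. Reporting all 40 comparisons makes that obvious; reporting only the one that crossed the threshold hides it.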

]]>‘A null hypothesis isn’t “a” hypothesis, it’s all the hypotheses that are mutually exclusive with your anticipated experimental results.’

That’s not true. If it were, you would calculate p-values very differently. Indeed, the problem with the null is that it is far too sharp…. so sharp that it is almost impossible that it is true.

]]>I don’t think there’s anything wrong with “chasing” noise–it’s only a problem when you embrace noise. I think that’s what you really mean, based on your comments elsewhere–that it’s foolish to claim that failing to reject represents evidence for a post hoc hypothesis. But as long as it’s done in the context of long-term model building/theory building, that’s just science. I would say the best answer to Peter’s question is that journals like papers that make positive, declarative statements. They want discoveries, not refinements. Which is why they prioritize p-values over parameter estimates in the first place.

Of course, the exception to the rule is when the post hoc hypothesis is logically stronger than the original hypothesis, and would have been a better a priori idea to test, but it didn’t even occur to the researcher until after seeing the data (like #3 in Jonathan’s comment above).

]]>Second, you have two choices. You can write the paper up as a failure to reject, and (correctly) say in your intro and discussion sections that the implicit support for the new hypothesis is a valuable finding that should be replicated with new data. This is textbook confirmatory vs exploratory analysis. It is almost certainly publishable, but you may pay a professional price for the tier of journal that will accept it. OR: You can write it up in the traditional way, and direct readers to the preprint (or call it a technical report) for a detailed description of the analytical process. Or call it supplemental materials. You will have sown some confusion in the literature as to the correct interpretation of your findings, but nonetheless, you’ve done your scientific duty.

Third: a null hypothesis isn’t “a” hypothesis, it’s all the hypotheses that are mutually exclusive with your anticipated experimental results. For any line of research you are committed to (most aren’t to their dissertation), try to specify these a priori–and more importantly, specify the differences in empirical consequences that would distinguish them. Structure this list as a tree diagram, with branches splitting off where empirical predictions are mutually exclusive. This can be a valuable design tool: sometimes you can design your study to exclude a portion of these even when you FTR your experimental hypothesis.

You now have an a priori research agenda. FTR and you move to the next plausible hypothesis that’s consistent with your results (understanding that this is probabilistic so new results may have you return to the “root” of your tree). Maybe the next best hypothesis doesn’t occur to you until you can look at data, but it will fit into your plan schematically–it will naturally form a new “leaf” of the branch representing hypotheses consistent with your actual results. I would argue that this isn’t just a helpful tool for planning research: your argument for secondary hypotheses is stronger, and maybe more publishable, if you can present it in this context.

]]>One of my favorite economics papers is Friedman and Ostroy, Competitivity in Auction Markets: An Experimental and Theoretical Investigation, Economic Journal, 1995, which starts (in the second paragraph, after describing the result that two-sided markets seem more competitive than we think they ought to be): “This paper began as a sharp disagreement between the two authors as to the proper explanation for the puzzle. We investigate three approaches to reconciling theory and experiment.

(1) The traditionalist approach, favored by one author….[description of approach excised]

(2) The institutionalist approach initially favored by the other author… [description of approach excised]

(3) A third approach, which we will call the as-if complete information Nash equilibrium approach (or complete information for short) occurred to us only after looking at the results of some experiments.”

The paper then goes on to provide a number of interesting tests of all three theories. But it is the transparent honesty at the start of the paper that I find so appealing. And the tests they give to the third theory suggest they aren’t just chasing noise, but that would take me too far afield.

]]>Peter:

Researchers *do* often write, and publish, things like, “I expected X and found Y, which is really exciting and makes us realise something important is up.”

The problem is, I’m suspicious of many of these claims. I’m not suspicious of the sincerity of these claims—I expect that this is really how the researchers perceive what happened—but I think that often what is happening is that the researchers are chasing noise. Yes, their findings are a surprise to them, but they’re making a mistake by generalizing from some patterns in a particular dataset and thinking that gives them more general knowledge of the world.

]]>