...it’s my impression that null hypothesis significance testing is generally understood as being part of a Popperian, falsificationist approach to science.

So I think it’s worth emphasizing that, when a researcher is testing a null hypothesis that he or she does not believe, in order to supply evidence in favor of a preferred hypothesis, this is confirmationist reasoning. It may well be good science (depending on the context) but it’s not falsificationist.

He says that what prompted him to emphasize this point again was an exchange on a blog entry by Deborah Mayo.

...Mayo wrote:

I’m not sure I’m getting to your concern Andrew, but I think that they see themselves as following a falsificationist pattern of reasoning (rather than a confirmationist one). They assume it goes something like this:

If the theory T (clean prime causes less judgmental toward immoral actions) were false, then they wouldn’t get statistically significant results in these experiments, so getting stat sig results is evidence for T.

This is fallacious when the conditional fails.

And I replied that I think these researchers are following a confirmationist rather than falsificationist approach. Why do I say this? Because when they set up a nice juicy hypothesis and other people fail to replicate it, they don’t say: “Hey, we’ve been falsified! Cool!” Instead they give reasons why they haven’t been falsified. Meanwhile, when they falsify things themselves, they falsify the so-called straw-man null hypotheses that they don’t believe.

The pattern is as follows: Researcher has hypothesis A (for example, that the menstrual cycle is linked to sexual display), then as a way of confirming hypothesis A, the researcher comes up with null hypothesis B (for example, that there is a zero correlation between date during cycle and choice of clothing in some population). Data are found which reject B, and this is taken as evidence in support of A. I don’t see this as falsificationist reasoning, because the researchers’ actual hypothesis (that is, hypothesis A) is never put to the test. It is only B that is put to the test. To me, testing B in order to provide evidence in favor of A is confirmationist reasoning.
Again, I don’t see this as having anything to do with Bayes vs non-Bayes, and all the same behavior could happen if every p-value were replaced by a confidence interval.

I understand falsificationism to be that you take the hypothesis you love, try to understand its implications as deeply as possible, and use these implications to test your model, to make falsifiable predictions. The key is that you’re setting up your own favorite model to be falsified.

In contrast, the standard research paradigm in social psychology (and elsewhere) seems to be that the researcher has a favorite hypothesis A. But, rather than trying to set up hypothesis A for falsification, the researcher picks a null hypothesis B to falsify and thus represent as evidence in favor of A.

As I said above, this has little to do with p-values or Bayes; rather, it’s about the attitude of trying to falsify the null hypothesis B rather than trying to falsify the researcher’s hypothesis A.
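The pattern described above (a favored hypothesis A, a straw-man null B) can be made concrete with a small simulation. What follows is only a sketch under invented assumptions, not anything taken from the quoted posts: the variable names, the confounder setup, and the choice of a permutation test are all hypothetical. Here hypothesis A is "x causes y" and null hypothesis B is "x and y have zero correlation". The data are generated so that A is false by construction, because a hidden variable z drives both x and y, yet B is still rejected.

```python
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided permutation test of null B: zero correlation between xs and ys."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        # Shuffling ys breaks any link to xs, which is what B asserts.
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def simulate_confounded(n, seed):
    """Hypothetical data: z drives both x and y; x has no effect on y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y


if __name__ == "__main__":
    x, y = simulate_confounded(200, seed=1)
    p = permutation_p_value(x, y)
    # A small p rejects B, but says nothing about whether x causes y.
    print(f"r = {pearson_r(x, y):.2f}, permutation p = {p:.4f}")
```

Rejecting B here rules out only "no association at all". Hypothesis A itself was never put at risk, which is the point of the passage: a falsificationist test would instead derive a prediction that A makes and its rivals (such as the hidden z) do not, and check that.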



It is tempting to frame falsificationists as the Popperian good guys who are willing to test their own models and confirmationists as the bad guys (or, at best, as the naifs) who try to do research in an indirect way by shooting down straw-man null hypotheses.

And indeed I do see the confirmationist approach as having serious problems, most notably in the leap from “B is rejected” to “A is supported,” and also in various practical ways because the evidence against B isn’t always as clear as outside observers might think.

But it’s probably most accurate to say that each of us is sometimes a confirmationist and sometimes a falsificationist. In our research we bounce between confirmation and falsification.