The reproducibility crisis in science is still getting worse.

i can understand why we didn’t panic when the ‘false positive psychology’ paper came out.  who knows how many of those p-hacking strategies people use, much less in what combination.

i can understand why we didn’t panic when the QRP paper came out.  who knows whether people did these things just once or many times in their lives.  and i think we can all agree that, whatever our beliefs about the prevalence of p-hacking are, we could design a survey that would produce responses consistent with those beliefs. which of the two QRP papers you think is more valid is predicted by your a priori beliefs about the prevalence of QRPs.**

i can even almost understand why not everyone panicked when the reproducibility project (RP:P) came out because, as we all teach our students, we never put much stock in single studies anyway.

ok, wait, no.  i’m lying.  i can’t understand why not everyone was worried about the RP:P results.

Agreed, though probably not for the same reason as Simine Vazire. As I explained last time,

If the typical power of a study is 45% or less, then 55% or more of all replications will fail to hit statistical significance even if the original study accurately found a legitimate effect. So it shouldn’t be much of a surprise when a major study finds 63% of studies fail to replicate, because that’s what you’d expect from the current scientific record.
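To make that arithmetic concrete, here is a minimal sketch. It assumes the best case described above: every original finding is a real effect, and each replication has the same power as a typical study. The function names, the effect size (d = 0.4) and the sample size (42 per group) are hypothetical values I picked because they land near 45% power under a normal approximation. They are not taken from the Reproducibility Project.

```python
from statistics import NormalDist


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d is the true standardized effect size (hypothetical input);
    n_per_group is the sample size in each group.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real.
    shift = d * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def expected_failure_rate(power):
    """Share of replications that miss significance when every
    original effect is real and replications have this power."""
    return 1 - power


if __name__ == "__main__":
    power = power_two_sample(d=0.4, n_per_group=42)
    print(f"power: {power:.2f}")
    print(f"expected replication failures: {expected_failure_rate(power):.2f}")
```

Even with no false positives anywhere in the original literature, a bit more than half of the replications in this sketch are expected to come up non-significant.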

The Reproducibility Project’s results should not have been a surprise, and that is why we should have been worried. Now, sadly, we have something even more worrying on our hands.

throughout the last few years, when i have talked to people,* one of the most strongly and frequently expressed reasons i’ve heard for not panicking is that it seems impossible that p-hacking is so rampant that even a phenomenon shown in 50 or 100 studies (e.g., ego depletion) could be a false positive. if a paradigm has been used over and over again, and dozens of papers have shown the effect, then it can’t all be a house of cards.

And yet, if the rumours are true, this collaboration is about to show that ego depletion either has a much smaller effect size than was previously reported, or no effect at all. Researchers have overturned findings backed by one or two dozen papers before, but this would wipe out a body of evidence about an order of magnitude bigger. Full details of the rumour are over here, but… damn. The next decade or so is going to be an epistemological bloodbath.