Time to Panic



The reproducibility crisis in science is still getting worse.

i can understand why we didn’t panic when the ‘false positive psychology’ paper came out.  who knows how many of those p-hacking strategies people use, much less in what combination.

i can understand why we didn’t panic when the QRP paper came out.  who knows whether people did these things just once or many times in their lives.  and i think we can all agree that, whatever our beliefs about the prevalence of p-hacking are, we could design a survey that would produce responses consistent with those beliefs. which of the two QRP papers you think is more valid is predicted by your a priori beliefs about the prevalence of QRPs.

i can even almost understand why not everyone panicked when the reproducibility project (RP:P) came out because, as we all teach our students, we never put much stock in single studies anyway.

ok, wait, no.  i’m lying.  i can’t understand why not everyone was worried about the RP:P results.

Agreed, though probably not for the same reason as simine vazire. As I explained last time,

If the typical power of a study is 45% or less, then 55% or more of all replications will fail to hit statistical significance even if the original study accurately found a legitimate effect. So it shouldn’t be much of a surprise when a major study finds 63% of studies fail to replicate, because that’s what you’d expect from the current scientific record.
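The arithmetic in that passage can be checked with a quick simulation. What follows is only a rough sketch under assumptions of my own, not anything taken from the Reproducibility Project: every original effect is real, every replication is a two-sided z-test at alpha = 0.05, and every replication has exactly 45% power. The function name `simulate_failure_rate` and all of its parameters are made up for illustration.

```python
import random
from statistics import NormalDist


def simulate_failure_rate(power=0.45, alpha=0.05, n_replications=20000, seed=0):
    """Fraction of replications of a TRUE effect that miss significance.

    Assumes a two-sided z-test whose statistic is normally distributed
    with unit variance (an illustrative simplification).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)  # two-sided critical value

    # Mean of the test statistic that yields the requested power,
    # ignoring the tiny chance of significance in the wrong direction.
    shift = z_crit + norm.inv_cdf(power)

    rng = random.Random(seed)
    failures = sum(
        abs(rng.gauss(shift, 1.0)) < z_crit  # replication not significant
        for _ in range(n_replications)
    )
    return failures / n_replications


if __name__ == "__main__":
    rate = simulate_failure_rate()
    print(f"share of failed replications: {rate:.3f}")
```

Under those assumptions the share of failed replications should come out close to 0.55, simply because a test with 45% power misses a real effect the other 55% of the time. Real replications differ in power from the originals, so treat this as a sanity check on the reasoning rather than a model of the actual project.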

The Reproducibility Project’s results should not have been a surprise, and that is why we should have been worried. Now, sadly, we have something even more worrying to worry over.

throughout the last few years, when i have talked to people, one of the most strongly and frequently expressed reasons i’ve heard for not panicking is that it seems impossible that p-hacking is so rampant that even a phenomenon shown in 50 or 100 studies (e.g., ego depletion) could be a false positive. if a paradigm has been used over and over again, and dozens of papers have shown the effect, then it can’t all be a house of cards.

And yet, if the rumours are true, this collaboration is about to show that ego depletion either has a much smaller effect size than was previously reported, or no effect at all. Failed replications have undercut findings backed by one or two dozen papers before, but this would wipe out a body of work about an order of magnitude bigger. Full details of the rumour are over here, but… damn. The next decade or so is going to be an epistemological bloodbath.

