So far, we’ve looked at 20 different datasets through six different hypotheses:
- H0: Precognition doesn’t exist.
- H1: Precognition exists.
- H2: Precognition exists and it’s never worse than chance.
- H3: Precognition exists, and if noticeable it’s never worse than chance, and across the population it has a Gaussian distribution, and the average value of said distribution is too small to be noticed, and the standard deviation is small enough that only a certain percentage of people notice… roughly speaking.
- H4: Precognition exists, and if noticeable it’s never worse than chance, and across the population it has a Gaussian distribution, and the average value of said distribution is too small to be noticed, and the standard deviation is small enough that only a certain percentage of people notice.
- H5: Precognition exists, and across the population it has a Gaussian distribution with mean 0.5082795699 and standard deviation 0.01760748774.
But our original question was simply “does precognition exist.” How do all those hypotheses answer that, especially since we only have Bayes Factors?
One way to answer is to clarify just what “precognition exists” means. The naive interpretation leads to H1, which fails totally but also isn’t a good fit for what we mean. When you say or think “precognition exists,” do you really believe that “everyone can predict the future with perfect accuracy” is just as likely to be true as “some people can predict the future with decent accuracy”? I certainly don’t, which is why I quickly started exploring other options. Juggling all of those options is easier in a Bayesian framework, because nuisance parameters are easier to deal with and you’re not forced to make your hypotheses mirror one another.
H4 gave the best result, balancing specific predictions against results favorable to precognition. How “best” is it, though? How much does it change our beliefs? Let’s do the math…
Hang on, this isn’t fair; Bernstein’s priors were laid out for a hypothesis more like H1 than H4, and all those extra premises mean the priors on the latter should be a lot lower. But I can’t call up Bernstein and ask her for an update. What to do?
We do a little more math. We do know how much certainty she’s attached to H0, so we can shuffle the equation around so that the prior probability of H4 becomes the output instead.
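Written out, with K standing in for the Bayes factor of H4 over H0 (my notation, not necessarily the series’), the shuffle is just the odds form of Bayes’ theorem:

```latex
\underbrace{\frac{P(H_4 \mid D)}{P(H_0 \mid D)}}_{\text{posterior odds}}
  \;=\; \underbrace{K}_{\text{Bayes factor}}
  \cdot \underbrace{\frac{P(H_4)}{P(H_0)}}_{\text{prior odds}}
```

Pick a target for the posterior odds, divide through by K, and the prior odds of H4 fall out as the output.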
Hmmm, shuffle it around to what, though? Are we just looking for the point where we put more confidence in H4 than in H0?
If our prior for H4 is above one in 965, we now favor H4 over H0; otherwise, we still prefer H0. That’s a pretty low bar, though. A single result that ever-so-slightly prefers H0 to H4 would shift our beliefs right back. About the only place you’ll encounter this level of proof is a civil court case, where “more likely than not” is the standard; whether or not that’s a good thing, I’ll leave to you.
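As a sketch of where a figure like one-in-965 comes from: if the posterior odds must reach at least 1:1, the prior probability of H4 must be at least 1/(1 + K). The Bayes factor of 964 below is hypothetical, back-solved from the one-in-965 figure, not a number quoted in the series.

```python
# Break-even prior for H4 vs. H0, given a Bayes factor K = P(D|H4)/P(D|H0).
# Posterior odds = K * prior odds, so the posterior odds hit 1:1 exactly when
# the prior odds are 1/K, i.e. when the prior probability of H4 is 1/(1 + K).

def breakeven_prior(bayes_factor):
    """Smallest prior P(H4) that still leaves H4 favored over H0."""
    return 1.0 / (1.0 + bayes_factor)

# A hypothetical Bayes factor of 964 puts the break-even prior at 1/965.
print(breakeven_prior(964))  # 1/965, about 0.001036
```

Any prior above that break-even point tips the posterior toward H4; any prior below it leaves H0 on top.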
Back in Frequentist-Land, we had a better bar: statistical significance. By demanding that the null hypothesis be wrong by at least a certain amount before rejecting it, we guard against a single study flip-flopping our beliefs.
There, if our priors are more favorable toward H4 than one in 51, we now think of H4 as convincing relative to H0. But we’re trying to stick to the Bayesian side of the fence here, and we’ve got a nice table of levels of evidence to draw from. Let’s go with… oh, right, those are all ranges of values. Ah well, that’s easily handled: we solve for a range of priors instead of a single threshold. So if our priors land anywhere in that range, we’re now “strongly in favor” of H4 over H0. We still don’t know what Bernstein’s priors are, but we know what she’d conclude the moment we hear them.
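To make that range calculation concrete, here’s a sketch using one common evidence table (Kass and Raftery’s, where posterior odds of 20 to 150 count as “strong”; the series’ own table may differ) and the same hypothetical Bayes factor of 964 back-solved from the one-in-965 figure:

```python
# Which priors on H4 would leave us "strongly in favor" of it?
# Posterior odds = K * prior odds, so for the posterior odds to land in an
# evidence band [lo, hi], the prior odds must land in [lo/K, hi/K].

def prior_odds_band(bayes_factor, lo=20.0, hi=150.0):
    """Prior-odds range whose posterior odds fall in [lo, hi]."""
    return lo / bayes_factor, hi / bayes_factor

def odds_to_prob(odds):
    """Convert odds in favor to a probability."""
    return odds / (1.0 + odds)

low, high = prior_odds_band(964)  # hypothetical K, "strong" band per Kass & Raftery
print(f"prior odds from about 1:{1 / low:.0f} up to 1:{1 / high:.1f}")
print(f"prior probability from {odds_to_prob(low):.4f} to {odds_to_prob(high):.4f}")
```

Priors below the band leave the evidence merely “positive”; priors above it push the conclusion into “very strong” territory instead.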
Which seems like a good place to conclude this series. It’s been quite a journey, but I hope I’ve convinced you of the power and flexibility of Bayesian hypothesis testing. As a bonus, I’ve also handed you a tool capable of handling the most common form of experiment you’ll encounter. If you’d like to test out your new skills, may I recommend some homework? So far we’ve only looked at studies from researchers who are firmly in the pro-precognition camp, but those aren’t the only ones out there: two studies have tried to replicate Bem’s findings, generating five new experiments in the process. Adding them to the pool should lead to interesting results.