
If you’ve read the entire series, you probably think that invoking Bayesian statistics is a huge pain. In general that *is* true, since you’re typically dealing with non-parametric distributions which can only be tackled with a lot of numeric integration.

There are specific instances where that’s not the case, however. If your prior and likelihood function follow the normal distribution, the posterior will also follow the normal distribution. Better still, you can easily calculate its exact parameters! If all you care about is the mean, plus the prior has a mean of $\mu_0$ and standard error $SE_0$, and you make $N_E$ observations that have mean $\mu_E$ and standard deviation $\sigma_E$, then your posterior will be described by:
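In its standard textbook form, the normal-normal update for the mean reads as follows. The labels $\mu_P$ and $SE_P$ for the posterior's mean and standard error are assumed here for illustration; the other symbols are the ones defined in the sentence above.

$$
\mu_P = \frac{\dfrac{\mu_0}{SE_0^2} + \dfrac{N_E\,\mu_E}{\sigma_E^2}}{\dfrac{1}{SE_0^2} + \dfrac{N_E}{\sigma_E^2}},
\qquad
SE_P = \left(\frac{1}{SE_0^2} + \frac{N_E}{\sigma_E^2}\right)^{-1/2}
$$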

Distributions that remain unchanged during Bayesian updating are known as conjugate distributions, and are analytic gravy. The normal distribution is especially handy as measurement errors generally follow it, so progressively refining your uncertainty is as easy as repeatedly chaining together your means, standard deviations, and pooled measurement counts. Things are even easier if you know the standard error of each observation, as you can swap that in for the standard deviation and set $N_E$ to 1.
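The chaining described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual precision-weighted form of the normal-normal update; the function name `update_mean` and the measurement values are hypothetical, chosen only for illustration.

```python
import math

def update_mean(prior_mean, prior_se, obs_mean, obs_sd, n_obs=1):
    """Normal-normal update for a mean; returns (posterior_mean, posterior_se).

    With n_obs=1, obs_sd is treated as the standard error of the observation.
    """
    prior_precision = 1.0 / prior_se ** 2
    data_precision = n_obs / obs_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_mean * prior_precision
                 + obs_mean * data_precision) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)

# Hypothetical measurements as (mean, standard error) pairs, not real data.
measurements = [(10.0, 2.0), (12.0, 1.0), (11.0, 0.5)]

# The first measurement serves as the prior; each later one updates it.
mean, se = measurements[0]
for obs_mean, obs_se in measurements[1:]:
    mean, se = update_mean(mean, se, obs_mean, obs_se)

print(mean, se)
```

Each pass feeds the previous posterior back in as the next prior, so the standard error can only shrink as measurements accumulate.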

Why the difference? The standard deviation of your pool of observations relates to the observations, while the standard error actually relates to the mean. It measures how uncertain we are about where the true population mean lies, based on your observations, hence the word “error.” The standard error isn’t a parameter of the data, it’s a parameter of another *parameter,* and is given the prefix “hyper” to mark it as special. That also means the prior distribution is actually a *hyper*prior, since it’s a probability distribution of our confidence in various means, and we’re left with a *hyper*posterior after our calculations.

It’s pretty confusing, I know, especially since the normal distribution has a convenient property relating the standard error of the mean to the standard deviation.
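For reference, the property in question, for $N$ independent observations with standard deviation $\sigma$, is:

$$
SE = \frac{\sigma}{\sqrt{N}}
$$

So quadrupling the number of observations halves the standard error of the mean, even though the standard deviation of the observations themselves stays put.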

One way to cut down on the confusion is to pay close attention to the wording. If you see “error,” “confidence,” or “credible” mentioned, you’re dealing with a hyperparameter. Remember too that the only parameter we care about above, and in the following examples, is the mean. The standard deviation of future measurements isn’t being predicted here, otherwise we’d need a confidence value for that too and we’d be tracking four parameters instead of two.

The examples should clear up the confusion. The gravitational constant from Newtonian mechanics is surprisingly difficult to measure. A number of authors have attempted it, and we can consolidate their work via conjugate distributions.[1] Since *G* is a constant, it should have no variance and so we’ll only track the mean. Each measurement has a standard error attached, so our work is pretty easy.

Look at that! No programming was involved; in fact, the math was trivial enough to do in a spreadsheet. Up for another round? Let’s hit the USA presidential polls![2]

Hold on a second, though, no-one lists a standard error for their numbers. Not to worry, they do have something called the “margin of error” (hey, “error!”) which is almost always a 95% confidence interval (“confidence!!”). Since that interval spans +/- 1.96 standard errors, we have enough info to run the stats. Again, we don’t care about the standard deviation here, as there’d be no variance in the level of support if we were able to question everyone in the USA.
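The conversion from margin of error to standard error can be sketched as below. This assumes every margin really is a 95% interval, and the poll figures are hypothetical placeholders, not numbers from RealClearPolitics. Pooling by precision is what chaining the update amounts to when the first poll serves as the prior.

```python
Z_95 = 1.96  # half-width of a 95% interval, in standard errors

def margin_to_se(margin_of_error, z=Z_95):
    """Convert a poll's margin of error into a standard error."""
    return margin_of_error / z

# Hypothetical polls: (support in percent, margin of error in points).
polls = [(44.0, 3.0), (46.0, 4.0), (43.0, 3.5)]

# Weight each poll by its precision, 1 / SE^2.
weights = [1.0 / margin_to_se(margin) ** 2 for _, margin in polls]
pooled_mean = sum(w * support for w, (support, _) in zip(weights, polls)) / sum(weights)
pooled_se = (1.0 / sum(weights)) ** 0.5
pooled_margin = Z_95 * pooled_se  # back to a 95% margin of error

print(pooled_mean, pooled_se, pooled_margin)
```

The pooled margin comes out narrower than that of any single poll, which is the whole point of combining them.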

Let’s ratchet up the difficulty level another notch. The inspiration for this addition was an excellent guest post at Sometimes I’m Wrong.

This chart summarizes all the results from the subject of the post, a study which did multiple experiments around the premise that self-control in one domain will translate into others.[3] It’s all well and good, but the main metric of significance is the p-value. Ugh. What can a Bayesian approach tell us?

The standard error is already in place, but what’s this “Std. diff in means (d)”? That’s a measure of effect size, specifically Cohen’s *d*. Never heard of it? It’s all the rage in meta-analysis, and fairly easy to understand. Basically, you subtract the mean of the control group from the mean of the test group, and divide by the pooled standard deviation from both.
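The recipe above translates into a short function. This is a sketch assuming two independent groups and the usual pooled standard deviation weighted by degrees of freedom; the function name and the sample values are hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(test, control):
    """Cohen's d: difference in group means over the pooled standard deviation."""
    n_t, n_c = len(test), len(control)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_t - 1) * variance(test)
                  + (n_c - 1) * variance(control)) / (n_t + n_c - 2)
    return (mean(test) - mean(control)) / math.sqrt(pooled_var)

# Hypothetical scores, for illustration only.
test_group = [5.0, 6.0, 7.0, 8.0]
control_group = [4.0, 5.0, 6.0, 7.0]

print(cohens_d(test_group, control_group))
```

Because the difference is divided by a standard deviation, *d* is unitless, which is what lets studies with different measurement scales be compared.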

At this point, it would be very tempting to leap on the fixed standard deviation of one, and use that in the analysis. That applies to the data, however, not the mean, hence it shouldn’t be a part of the hyperprior. We should just stick with the listed standard errors.

With a Gaussian distribution in place, the natural next step is to sample from it. The easy way to do this is by calculating a discrete odds ratio between two hypotheses:

$H_1$: Self-control is real, with effect size *d* = 0.219099.

$H_0$: Self-control is a statistical artifact, with *d* = 0.

As you’ll remember from last time, though, the discrete approach isn’t very realistic. So let’s also do a simple range:

$H_1$: Self-control is real, with an effect size between *d* = 0.5 and *d* = 0.2.

$H_0$: Self-control is a statistical artifact, with *d* between 0.1 and -0.1.

As it turns out, this is trivially easy to do in a spreadsheet.

Overall, that’s decent but not ironclad evidence in favor of self-control across a variety of conditions.
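Both comparisons can also be sketched in Python instead of a spreadsheet. This is one plausible reading, not necessarily the exact spreadsheet calculation: the discrete ratio compares the posterior density at the two point values, and the range ratio compares the posterior probability mass in the two intervals. The posterior mean is the value quoted in $H_1$; the standard error below is a hypothetical placeholder, so the printed odds are illustrative only.

```python
import math

def normal_pdf(x, mean, se):
    """Density of a normal distribution at x."""
    z = (x - mean) / se
    return math.exp(-0.5 * z * z) / (se * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, se):
    """Cumulative probability of a normal distribution up to x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (se * math.sqrt(2.0))))

def range_prob(lo, hi, mean, se):
    """Probability mass between lo and hi."""
    return normal_cdf(hi, mean, se) - normal_cdf(lo, mean, se)

POSTERIOR_MEAN = 0.219099  # pooled effect size quoted in H1
POSTERIOR_SE = 0.08        # hypothetical placeholder, NOT the study's value

# Discrete hypotheses: density at d = 0.219099 versus density at d = 0.
point_odds = (normal_pdf(POSTERIOR_MEAN, POSTERIOR_MEAN, POSTERIOR_SE)
              / normal_pdf(0.0, POSTERIOR_MEAN, POSTERIOR_SE))

# Range hypotheses: mass in [0.2, 0.5] versus mass in [-0.1, 0.1].
range_odds = (range_prob(0.2, 0.5, POSTERIOR_MEAN, POSTERIOR_SE)
              / range_prob(-0.1, 0.1, POSTERIOR_MEAN, POSTERIOR_SE))

print(point_odds, range_odds)
```

The point comparison always favours the posterior mean by construction, which is one more reason to prefer the range version.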

Did you notice the conjugate distribution’s mean was in perfect agreement with the pooled Cohen’s *d* in the original paper? There’s a good reason for that: the conjugate Gaussian distribution for the mean is a simple weighted average.
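Spelled out, chaining the update over $k$ studies with effect sizes $d_i$ and standard errors $SE_i$ (index names assumed here), with the first study serving as the prior, collapses to the inverse-variance weighted mean used in fixed-effect meta-analysis:

$$
\bar{d} = \frac{\sum_{i=1}^{k} d_i / SE_i^2}{\sum_{i=1}^{k} 1 / SE_i^2},
\qquad
SE_{\bar{d}} = \left(\sum_{i=1}^{k} \frac{1}{SE_i^2}\right)^{-1/2}
$$

Studies with small standard errors get large weights, and the order in which they are chained makes no difference to the result.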

This also means it’s trivial to whip up a home-brew meta-analysis. All the usual caveats about garbage-in apply, of course. No *d* value listed? Converting to it is pretty easy. This opens up a lot of doors to applying Bayesian statistics in practice, no doubt helped by the above examples in spreadsheet form and the abundant number of conjugate distributions.

I’ll try to have another practical example up shortly.

[1] Pitkänen, Matti. “Variation of Newton’s Constant and of Length of Day.” *Prespacetime Journal* 6, no. 5 (2015).

[2] “RealClearPolitics – 2016 Election – General Election: Trump vs. Clinton.” Accessed May 28th, 2016.

[3] Tuk, Mirjam A., Kuangjie Zhang, and Steven Sweldens. “The Propagation of Self-Control: Self-Control in One Domain Simultaneously Improves Self-Control in Other Domains.” *Journal of Experimental Psychology: General* 144, no. 3 (2015): 639–54. doi:10.1037/xge0000065.