**[time to roll up our sleeves]**

The analysis is going to be pretty math-heavy, but it’s nothing you haven’t seen before.

We begin where we began last time, with Douglas Hugh, only this time we consider the odds of his claim being false. We have that statistic on hand, 8%. Pauline Gray’s account is independent, and so also counts as a separate 8%, and by the same token Smyth’s is another 8% added in.

maryann differs in one important aspect, that she is semi-anonymous behind her username. That 8% figure was only for named people, so we have some work to do.

To calculate (*A / (A + B)*), we need to multiply the odds of a nest or potential nest, (*(A + C) / (all)*), by the odds of that claim being made, (*A / (A + C)*), as well as the odds of no nest (*(B + D) / (all)*) by the odds of that claim being fabricated (*B / (B + D)*), then combine them all together as per before. We have all those odds on hand, too.
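The combination described above can be sketched in a few lines. This is a minimal illustration, not the spreadsheet's actual formula: the function name, the parameter names, and the input values are all hypothetical, chosen only to show how the two products combine into *A / (A + B)*.

```python
def share_of_claims_true(p_nest, p_claim_given_nest, p_claim_given_no_nest):
    """Return A / (A + B) for a two-by-two partition of claims."""
    # Partition A: a nest or attempt occurred, and a claim was made.
    a = p_nest * p_claim_given_nest
    # Partition B: no nest or attempt occurred, yet a claim was fabricated.
    b = (1 - p_nest) * p_claim_given_no_nest
    return a / (a + b)


# Hypothetical inputs, for illustration only.
example = share_of_claims_true(
    p_nest=0.5, p_claim_given_nest=0.2, p_claim_given_no_nest=0.05
)
print(f"share of claims that are true: {example:.0%}")
```

The odds of falsehood are simply one minus this value.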

Hold on, though. True claims of nesting can only occur if a nesting was attempted or completed, yet false claims carry no such restriction. That’s not to say they’re free, either, because they’re bound by the false report, odds of report, and prevalence statistics. Let’s pull back for a bit.

That *B* partition really sticks out. A nest or attempted nest was made, but not for the person claiming one? Weird. There are three ways to handle this:

- Treat it as a true claim. It’s in line with reality, after all.
- Treat it as a false claim. This person had no justification for claiming a nest, and was correct only by fluke.
- Ignore it. It’s a weird, rare situation that’s tough to account for.

I’ll go with option two. It’s the one best in line with our intuitions about knowledge, and it’ll deflate the odds of a nest being there. It would be incredibly embarrassing to declare there was a nest, drive all the way out there, and find out we were wrong. When in doubt, take the more conservative approach.

Anyway, we already know that partitions *B* and *D* combined are 8% the size of partition *A*, and since both are independent of the odds of a nest or attempt actually occurring, they aren’t attenuated by that prevalence, allowing us to lump them together. We can mix in the factor related to “reporting” this faux nest/attempt too. So let’s flex our math, and rearrange for the odds of someone inventing a claim. We have all those numbers: the prevalence of birds nesting or attempting to nest is 29.2%; the odds of getting a report are 5%; and we already have the false report rate, 8%. Plug all those in, and the odds of someone inventing a claim are around 0.1%. That’s nearly *two* orders of magnitude lower than the rate of false reports we used! The old figure is clearly wrong; at the same time, maryann’s anonymity makes it more likely she would invent a claim than the general public, so our replacement should be above 0.1%. I’ll be charitable, and go with 3%, or about twenty-eight times more likely than the base rate. We should also revise our estimate of a truthful anonymous report becoming public downwards, as I assessed it relative to false report odds; I’ll be extremely charitable and go to 25%, in this case.
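As a rough check on that 0.1% figure, here is a minimal sketch of one plausible reading of the arithmetic: partition *A* is taken as the prevalence times the odds of a report, and the invented claims as 8% of that. The exact rearrangement isn’t spelled out above, so the formula and the variable names are assumptions.

```python
# One plausible reading of the base-rate arithmetic (an assumption, not
# necessarily the exact rearrangement used in the text).
prevalence = 0.292         # odds of a nest or attempted nest
report_rate = 0.05         # odds of a nest or attempt being reported
false_report_rate = 0.08   # invented claims relative to reported nests

reported_nests = prevalence * report_rate              # partition A
invented_claims = false_report_rate * reported_nests   # partitions B + D

print(f"odds of someone inventing a claim: {invented_claims:.2%}")
```

Under this reading the result lands in the neighbourhood of 0.1%; a different rearrangement of the same three inputs would shift the exact figure slightly.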

With that revision, the math is quite straightforward and results in a 67% chance of falsehood.

Which returns us to daufnie_odie. There are two ways they could be false: either they made the entire thing up, or they were fooled by someone making a false claim. We never accounted for that when we calculated that 11% before, so we need to revise the earlier diagram.

The same rationale we used with “correct by fluke” claims applies to partition *B*; it’s a mistake to think “person X lied or misled me about a nest at a specific site” is the same as “no nest ever occurred at that site,” but I’m going to think that anyway. That means there’s only one way the nesting could occur: a nesting occurred, and daufnie_odie was approached, AND that claim was a true one. As there’s no doubt about the existence of the claim, all of partitions *F* through *J* are dropped from the analysis, leaving us to calculate (*(B + C + D + E) / (A + B + C + D + E)*), or alternatively (*1 – A / (A + B + C + D + E)*).

At this point, it’s terribly tempting to replace *(A + B + C + D + E)* with 1; after all, that seems to represent the sum of every other possibility. But re-read the sentence before the last. We just dropped a number of probabilities from the analysis, and while they were contrary to reality they were still part of the overall universe. We can’t short-cut the math here, unfortunately.
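To see why the shortcut fails, consider a small sketch with made-up partition sizes. Every weight below is hypothetical and chosen only so the full set sums to 1; the point is the gap between renormalising over the partitions where a claim exists and dividing by the whole universe.

```python
# Hypothetical partition weights, for illustration only. A through E are
# the cases where the claim exists; F through J are the cases where it
# does not, which we dropped as contrary to reality.
weights = {
    "A": 0.010, "B": 0.002, "C": 0.004, "D": 0.003, "E": 0.001,
    "F": 0.200, "G": 0.180, "H": 0.250, "I": 0.150, "J": 0.200,
}

claim_exists = sum(weights[k] for k in "ABCDE")

# Correct: renormalise over the partitions that survive.
p_false = 1 - weights["A"] / claim_exists

# Tempting shortcut: pretend A + B + C + D + E sums to 1.
p_false_shortcut = 1 - weights["A"] / 1.0

print(f"renormalised: {p_false:.0%}, shortcut: {p_false_shortcut:.0%}")
```

With these invented weights the shortcut badly overstates the odds of falsehood, because *A* is being compared against possibilities we already ruled out.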

But we can group together partitions. *C* and *E* are first-hand false claims from daufnie_odie themselves, independent of the nesting state, and as with maryann they can be lumped together. *B* and *D* are both second-hand false claims, from someone who fooled daufnie_odie, and for the same reason they too can be lumped. These lumps have a relationship, though; daufnie_odie would have no reason to invent a claim if they were approached by someone with a convincing claim. So (*B + D*) attenuates (*C + E*), and itself is attenuated by the odds of convincing someone of our claim. As the person approaching daufnie_odie is named, their false reporting rate should fall between the number I assigned for pseudonymous people and the base rate we calculated earlier; I’ll be charitable and set it at 1%, or nearly an order of magnitude more likely.

In sum, the numbers we need for daufnie_odie are:

But once we have those numbers, everything falls into place nicely…

… for some definition of “nice.” Biff Jag’s case is similar to daufnie_odie’s, with the main difference being that Jag had greater proximity to the nesting site and thus his second-hand transmission would be more reliable. If it’s true, of course.

Smyth has since spoken out, which should bump up the probabilities for the better. Incorporating that would be a bit tricky, though, so we’ll again be conservative and skip it. That allows us to reuse quite a bit from daufnie_odie; we’ll swap out the odds of a direct witness mentioning it to someone, for instance, and swap in the odds of Biff Jag being there shortly after the event, which I’ll put at 66%. We also need the odds of Jag being fooled, which should be lower than the false report odds (as those are investigated far after the event, with less evidence available).

Gather all this up, and we get 22%.

Tompsin’s turn. We don’t know if he was referring to two separate attempted nests, or just one. We’ll be charitable and assume the singular case, again low-balling the odds. His case is best lumped together with Puppy, as both claim to have been informed by Gruthi.

One key difference is that we have a lot more names running around. We also know that Puppy and Tompsin spent a lot of time hanging around with Gruthi, and a scan of the forum shows that Gruthi’s had plenty of opportunity to correct any mistakes, yet kept silent. A false claim by two people is much less likely than one by a single person, as one could easily rat out the other or get their story confused. Collusion is always a possibility, though, so we’ll be charitable and keep this very high, say at 0.5%, or five times more likely than the base rate. Conversely, it’s a lot less likely that two people would be convinced by a false claim; let’s set those odds at 25%, versus the 50% we’ve been using before. We’ll also estimate there’s a 33% chance that Gruthi was on the scene.

Much like with Smyth, Pauline Gray has since spoken out about the same incident; for the same reasons, we’re going to low-ball the odds and ignore that. So this leaves the math at a 29% chance of falsehood.

Next up are Myerson, Puppy, and the unnamed person who later contacted Myerson. We’ll split this into two parts: the initial approach of Puppy and Myerson with the claim, followed by the unknown woman who approached Myerson to back up the other person’s earlier claim. This person approached after the original claim was made, and counts as an independent witness like Gruthi.

This should be fairly rote by now, so I’ll skip the table. As we did with Gray, we’ll ignore the fact that Smyth publicly spoke out. We’re not sure of the relation between Smyth and Puppy, so we’ll set the odds of contact at four in ten. We do know Puppy deliberately searched out Myerson, so the odds of contact there are quite high; let’s put it at 80%. While again we find two people potentially inventing a claim jointly, this time we’ll be even more charitable and argue the odds of that are the same as one person inventing the claim, or 1%.

The math translates into a 41% chance of falsehood.

Next, we have this other person who later approached Myerson.

The odds of an approach would be quite high, as Myerson’s post had attracted quite a bit of attention; we’ll set it at 80%. At the same time, this person is never named and is discussing something Myerson’s already stated in public, making it very easy for Myerson to invent the thing whole-cloth with little consequence, so we’ll be charitable and use the pseudonymous false claim statistic for Myerson instead of the named one.

The odds of falsehood thus sit at 44%.

There’s also “skippingthem.” Their situation is almost an exact match for daufnie_odie’s, save that they’re discussing Smyth’s case instead of someone else’s. Since we’re being charitable and ignoring that Smyth came forward, the math is an exact match and still comes out to a 68% chance of falsehood.

Now for the tricky bits, like Grandie.

Reading over his account carefully, he never explicitly says there’s a nesting at 84744 M.S.; however, he was asked about the nest site, suggests several people had mentioned nest-like behavior to him, and muses about implementing anti-nest techniques on the site. It’s terribly suspicious, but how suspicious?

We want partitions *B*, *C*, *D*, *E*, and *F* in the numerator here. The first two combined are the odds of a false claim from a named person, while the last three are independent of whether or not a nesting occurred and thus just sum to the odds of Grandie discussing something else. That last number should be pretty low, so I’ll put it at 25%. The denominator is all of these together, so we also need *A*. That’s a combination of a nest or attempt occurring, Grandie hearing about a true claim (we’ll use 30%, modulated by the truthful report rate), and discussing it (I’ll assign 80%). In total, the odds of falsehood are 95%.

Finally, there’s the evidence of nesting. Myerson discovered two shifts in the nesting site, and the evidence is solid enough we don’t need to trust his word. But just because there’s shifting doesn’t necessarily mean there was a nesting.

Partitions *C* and *D* are contrary to reality, so they get struck off. We just need to know the odds of seeing those disturbances if there was a nest or attempt (I’ll use 95%) versus not even an attempt (say 10%). That puts the odds of falsehood at 62%.

Now that we know the odds of each individual case being false, we multiply them together and negate to arrive at the odds of at least one nest or attempted nest at site 84744 M.S.

There, we can be 99.9999% confident, based on the numbers I’ve supplied. That’s shockingly high, given just how often I was charitable to the no-nest hypothesis. If you think I mucked up the math, here’s a spreadsheet you can use to check my work.
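If you’d rather not open a spreadsheet, the multiply-and-negate step can be sketched in a few lines. The figures are the odds of falsehood quoted above; the labels are mine, and daufnie_odie’s 68% is an assumption inferred from the remark that skippingthem’s math is an exact match, since that number isn’t stated directly. As in the text, every line of evidence is treated as independent.

```python
from math import prod

# Odds of falsehood for each line of evidence, as quoted in the text.
# daufnie_odie's value is inferred from skippingthem's (an assumption).
p_false = {
    "Douglas Hugh": 0.08,
    "Pauline Gray": 0.08,
    "Smyth": 0.08,
    "maryann": 0.67,
    "daufnie_odie": 0.68,
    "Biff Jag": 0.22,
    "Tompsin and Puppy, via Gruthi": 0.29,
    "Puppy and Myerson": 0.41,
    "unnamed person, via Myerson": 0.44,
    "skippingthem": 0.68,
    "Grandie": 0.95,
    "site disturbances": 0.62,
}

# Every line of evidence has to be false for there to be no nest at all.
p_all_false = prod(p_false.values())
p_at_least_one_true = 1 - p_all_false

print(f"{p_at_least_one_true:.4%}")  # prints 99.9999%
```

Swapping in your own estimates for any entry shows how sensitive the final figure is to each assumption.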

But I promised you a bit more than just numbers…
