I recently read this beautiful explanation of Bayes’ theorem. I’d always thought it was a statement of philosophy, but it isn’t: it comes from plain old probabilities.
The formula for the conditional probability of A being true given that B is true is
P(A | B) = P(A & B) / P(B)
That is, the proportion of things in B that are also A equals the proportion of things that are both A and B, divided by the proportion of things that are B (I like to think of these things in terms of Venn diagrams).
We can rearrange the above to get
P(A & B) = P(A | B).P(B)
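To make the Venn diagram picture concrete, here is a small sketch that checks both formulas by counting. The universe of twelve equally likely outcomes and the particular sets A and B are made up for illustration; they are not part of the original explanation.

```python
from fractions import Fraction

# Assumed toy universe: the numbers 1..12, each equally likely.
universe = set(range(1, 13))
A = {n for n in universe if n % 2 == 0}  # even numbers
B = {n for n in universe if n % 3 == 0}  # multiples of 3


def prob(event):
    """Proportion of the universe that falls inside `event`."""
    return Fraction(len(event), len(universe))


p_b = prob(B)                   # proportion of things that are B
p_a_and_b = prob(A & B)         # proportion of things that are both A and B
p_a_given_b = p_a_and_b / p_b   # P(A | B) = P(A & B) / P(B)

# Counting directly inside B should give the same answer.
direct = Fraction(len(A & B), len(B))

print(p_a_given_b, direct)             # both are the same fraction
print(p_a_given_b * p_b == p_a_and_b)  # the rearranged rule holds
```

Using exact fractions rather than floats means the two sides of each formula can be compared with `==` without worrying about rounding.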
Now for Bayes’ theorem: let’s write H for our hypothesis and E for our evidence. We want to know how seeing the evidence E affects the probability of our hypothesis H being true. From the first rule above we have
P(H | E) = P(H & E) / P(E)
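Now we can apply the second rule to P(H & E) with the roles of A and B swapped: H & E is the same thing as E & H, so P(H & E) = P(E | H).P(H). Substituting this in, we get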
P(H | E) = P(E | H).P(H) / P(E)
Et voilà! No magic at all.
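As a rough numerical sketch of the formula above, here is how the update might look in code. The numbers are invented for illustration, and so are the names `bayes_posterior`, `p_h`, `p_e_given_h` and `p_e_given_not_h`. One step goes beyond the derivation above: P(E) is computed by splitting it into the cases where H is true and where H is false, P(E) = P(E | H).P(H) + P(E | not H).P(not H).

```python
from fractions import Fraction


def bayes_posterior(p_h, p_e_given_h, p_e):
    """Return P(H | E) = P(E | H).P(H) / P(E)."""
    return p_e_given_h * p_h / p_e


# Assumed, made-up inputs.
p_h = Fraction(1, 100)             # prior: P(H)
p_e_given_h = Fraction(9, 10)      # P(E | H)
p_e_given_not_h = Fraction(1, 20)  # P(E | not H)

# P(E), split over H being true or false (an extra step, see above).
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

posterior = bayes_posterior(p_h, p_e_given_h, p_e)
print(posterior)  # 2/13
```

With these numbers, seeing the evidence moves the probability of the hypothesis from 1/100 to 2/13: a big jump, but still well short of certainty, because the evidence also turns up fairly often when the hypothesis is false.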
As an aside, the raven paradox has convinced me that Bayesianism is philosophically superior to frequentism.