Oh, sorry - the prior probability that any given person is Bruce Willis is roughly 1 in 6x10^9.

I can follow the nice, sensible text: you take an initial probability of something happening and a binary test, which together branch into four outcomes (genuine positives, false positives, genuine negatives and false negatives). The ratios between those four branches describe how accurately the test discerns the two possibilities. The more accurate the test, the more tightly coupled it is to the thing you are testing for, and the more it drives the initial probability one way or the other.
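As a rough sketch of that four-branch picture, here is how the update might look in Python. The function names (`four_branches`, `posterior`) and the numbers in the usage lines are made up for illustration, not taken from any particular text.

```python
def four_branches(prior, p_pos_given_true, p_pos_given_false):
    """Split the population into the four outcomes of a binary test.

    prior: initial probability that the thing is true.
    p_pos_given_true: chance the test says yes when it is true.
    p_pos_given_false: chance the test says yes when it is false.
    """
    return {
        "genuine_positive": prior * p_pos_given_true,
        "false_negative": prior * (1 - p_pos_given_true),
        "false_positive": (1 - prior) * p_pos_given_false,
        "genuine_negative": (1 - prior) * (1 - p_pos_given_false),
    }


def posterior(prior, p_pos_given_true, p_pos_given_false):
    """Probability the thing is true, given that the test came back positive."""
    b = four_branches(prior, p_pos_given_true, p_pos_given_false)
    # Of everyone who tests positive, what share are genuine positives?
    return b["genuine_positive"] / (b["genuine_positive"] + b["false_positive"])


# Illustrative numbers only: a 1% prior, a test that catches 80% of true
# cases and wrongly flags 9.6% of false ones.
print(posterior(0.01, 0.80, 0.096))

# A test that is no better than a coin flip leaves the prior where it was.
print(posterior(0.30, 0.50, 0.50))
```

The second call is the "not coupled at all" case: when the test is equally likely to come back positive either way, the prior barely moves, and the further apart the two rates are, the harder the result pushes.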

Take "A Plan for Spam": you start with an initial probability that an email is spam. Then you have a test such as "does the email contain 'vi4gra'", plus some probabilities calculated from your email archive of how often spam contains 'vi4gra' and how often it doesn't, and how often ham contains 'vi4gra' and how often it doesn't. From those you can precisely adjust your probability of an email being spam based on whether it contains that text.
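Here is a minimal sketch of that single-word update, assuming a hypothetical archive of 1,000 spam and 4,000 ham messages with invented counts for 'vi4gra'. It is plain Bayes' Theorem on one word, not the exact formula from "A Plan for Spam", which combines evidence from many words; `spam_probability` is a name I made up.

```python
def spam_probability(spam_with, spam_total, ham_with, ham_total, contains_word):
    """Update P(spam) for one email based on whether it contains one word.

    spam_with / ham_with: archived spam / ham messages containing the word.
    spam_total / ham_total: archived spam / ham messages overall.
    contains_word: True if the incoming email contains the word.
    """
    # Initial probability: the share of the archive that is spam.
    prior = spam_total / (spam_total + ham_total)

    # How often the word shows up in each kind of mail.
    p_word_given_spam = spam_with / spam_total
    p_word_given_ham = ham_with / ham_total

    # If the word is absent, use the "doesn't contain it" rates instead.
    if not contains_word:
        p_word_given_spam = 1 - p_word_given_spam
        p_word_given_ham = 1 - p_word_given_ham

    spam_branch = prior * p_word_given_spam
    ham_branch = (1 - prior) * p_word_given_ham
    return spam_branch / (spam_branch + ham_branch)


# Hypothetical counts: 200 of 1,000 spam and 4 of 4,000 ham contain 'vi4gra'.
print(spam_probability(200, 1000, 4, 4000, contains_word=True))
print(spam_probability(200, 1000, 4, 4000, contains_word=False))
```

With these invented counts, seeing the word pushes the probability well above the 20% starting point, and not seeing it nudges the probability slightly below it.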

Training it on your email archive means you can judge incoming email never seen before without being unfairly prejudiced against it just because it's new, only judging it on whether it has spammy characteristics - and Bayes' Theorem applied to your previous email _tells you what it means for an email to have spammy characteristics_, is that right?

It seems kind of straightforward, and yet I can't (yet) follow it through numerically so I clearly don't understand it; I may be glossing over big important parts of it. So if 300 people have a 30% probability of having a positive result from a mammography, and 60% of people who don't have a positive result aren't 40% less likely to not have a 1 in 10 chance of already being a winner ... wait, where did the train come into it again?