PopularBoard's comments

PopularBoard · on June 13, 2019

I'm a little confused: how technical is this approach? I can't understand the meaning of P(D), for example. Does it make sense in strict mathematics?

zwaps · on June 13, 2019

The approach is formally correct. You always have to make sure these values actually exist, but otherwise it goes through.

The example is probably not the very best, but P(D) may make more sense if you think of the following:

If D equals the amount of coffee I put in the grinder, then D has a certain random component. Sometimes I put in more, sometimes less - even though I aim for a specific level. This is why it is important to have a concept of P(D) in Bayes' equation. The idea here is that the one case where I inadvertently put in a lot of coffee should not be treated as "strong evidence".

vidarh · on June 13, 2019

P(D) is just the probability of D. This is common when talking about probabilities.
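
For reference, a standard way to write this, where H is a hypothetical label for the hypothesis being tested (the symbol is an assumption here, not something taken from the thread) and the H_i are a set of mutually exclusive hypotheses that cover all cases:

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)},
\qquad
P(D) = \sum_i P(D \mid H_i)\, P(H_i)
```

The second equation is the law of total probability. When the hypotheses form a continuum, the sum becomes an integral.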

PopularBoard · on June 13, 2019

How can we talk about the probability of the data (D)?

vidarh · on June 14, 2019

D in this case refers to a specific set of variables that go into brewing coffee. P(D) then refers to the probability of observing a given set of values for that vector of variables, out of all the possible values.

Don't take it too literally - P(..) here is not some well-defined function, it's effectively just part of the name, as a convention for naming probabilities. I find it confusing too.

As the article points out, that set is for all intents and purposes infinite in this case, but this doesn't matter, as you can sidestep it by comparing to complementary hypotheses (which makes P(D) cancel out). This is all covered in the article.

The only maths worth reading up on to understand this article is a basic introduction to Bayes' theorem - the Wikipedia page is quite decent.
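
To make the "P(D) cancels out" point concrete, here is a minimal sketch in Python. The numbers and function names are made up for illustration and do not come from the article; it assumes just two complementary hypotheses, H and not-H. The first function never computes P(D), the second one does, and both give the same odds.

```python
from fractions import Fraction


def posterior_odds(prior_h, likelihood_h, likelihood_not_h):
    """Return P(H|D) / P(not H|D) without ever computing P(D)."""
    prior_not_h = 1 - prior_h
    return (likelihood_h * prior_h) / (likelihood_not_h * prior_not_h)


def posterior_via_evidence(prior_h, likelihood_h, likelihood_not_h):
    """Return P(H|D), computing P(D) by the law of total probability."""
    prior_not_h = 1 - prior_h
    evidence = likelihood_h * prior_h + likelihood_not_h * prior_not_h  # P(D)
    return likelihood_h * prior_h / evidence


# Hypothetical values: P(H), P(D|H), P(D|not H).
prior = Fraction(1, 2)
lik_h = Fraction(3, 10)
lik_not_h = Fraction(1, 10)

odds = posterior_odds(prior, lik_h, lik_not_h)
post = posterior_via_evidence(prior, lik_h, lik_not_h)

# Converting the posterior back to odds gives the same number,
# so P(D) was never needed to compare H against not-H.
assert post / (1 - post) == odds
print(odds, post)
```

Exact fractions are used instead of floats so the equality check is not affected by rounding.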