
I don't understand how the two camps can exist. Don't the two methods produce different results? Surely, only one is true. Which is it?


It's mostly a philosophical difference between thinking of probabilities as measures of relative frequency versus thinking of probabilities as measures of one's uncertainty about the outcome. There isn't as much of a war between them as there used to be, but if you want to read about the history, this was a book I enjoyed: http://www.amazon.com/The-Theory-That-Would-Not/dp/030016969...

Having been horribly biased in favor of the Bayesian interpretation ever since I learned it was a thing, I'll give an example of a place where frequentist methods can go wrong. People who disagree can give counterexamples. ;)

http://lesswrong.com/lw/1gc/frequentist_statistics_are_frequ...

On the other hand, some argue that certain forms of inference are invalid and that it doesn't matter if they give the correct answer or not in practice because they're invalid. Calculus was attacked on this basis early on because many mathematicians thought that taking the limit of something as it approached 0 wasn't a thing you should be able to do.


Thanks, I'll read that post now. I'm actually going by Yudkowsky's post:

http://lesswrong.com/lw/ul/my_bayesian_enlightenment/

It might not have been that exact post; it was one where he mentioned the riddle and how his friend got the wrong answer because he was a frequentist. That might be where most of my misconception arises from.


>Surely, only one is true.

This is my general comment about statistics and data analysis without going into the specifics of Freq vs Bayes:

As anyone who ever worked with real-world data can tell you, for the most part data analysis is more of an art than exact science. Sure, it's math, and once you pick the right model, there is (usually) one correct way to solve it. The problem is that most mathematical methods come with assumptions that are almost never met in practice, so you have to make a lot of (often fairly arbitrary) decisions about how to go about your data analysis. How do you handle missing data cases? Is your data normally distributed enough to justify the use of some method? Is the sample large enough? Are those residuals in your model diagnostics random enough? Does that trend line look linear enough? What prior information can I use (and how?) to formally add value to the model? Real world is messy.
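To make that concrete: even a toy analysis forces a couple of those judgment calls. Here's a pure-Python sketch with entirely made-up data, where both "decisions" are arbitrary — which is the point:

```python
import math
import random

random.seed(0)

# Made-up dataset with ~10% missing values (None) standing in for the
# gaps you get in real data.
raw = [random.gauss(50, 10) if random.random() > 0.1 else None
       for _ in range(500)]

# Decision 1: how to handle missing data. Complete-case analysis
# (just dropping them) is one of several defensible choices.
data = [x for x in raw if x is not None]

# Decision 2: is the data "normal enough"? One crude check is sample
# skewness -- near zero is a (weak) point in favor of normality.
n = len(data)
mean = sum(data) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
skewness = sum(((x - mean) / sd) ** 3 for x in data) / n

print(len(raw) - len(data), "rows dropped; skewness =", round(skewness, 2))
```

Neither choice is "the" right one — imputation instead of dropping, or a formal normality test instead of eyeballing skewness, would be equally defensible.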


Given your clarification...

The two methodologies can give somewhat different results, but not as often as you might think, and the differences aren't as large as you might think. In my experience, the instances where you get radically different answers from Bayesian/Frequentist methods are quite rare, and tend toward the pathological example invented purely to demonstrate the "superiority" of one over the other.
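As a sketch of how often they agree: for a normal mean with known variance, the frequentist 95% confidence interval and the Bayesian 95% equal-tailed credible interval under a flat prior coincide numerically (a standard textbook case — the data, seed, and grid below are made up for illustration):

```python
import math
import random

random.seed(0)

# Toy data: 50 draws from Normal(10, 2), with sigma = 2 treated as known.
n, sigma = 50, 2.0
data = [random.gauss(10, sigma) for _ in range(n)]
xbar = sum(data) / n
se = sigma / math.sqrt(n)

# Frequentist 95% confidence interval for the mean.
ci = (xbar - 1.96 * se, xbar + 1.96 * se)

# Bayesian posterior under a flat prior, computed on a grid:
# posterior(mu) is proportional to the likelihood of the data at mu.
grid = [xbar - 4 * se + i * (8 * se / 10000) for i in range(10001)]
loglik = [-sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2) for mu in grid]
m = max(loglik)
weights = [math.exp(l - m) for l in loglik]
total = sum(weights)

# 95% equal-tailed credible interval from the grid posterior.
cum, lo, hi = 0.0, None, None
for mu, w in zip(grid, weights):
    cum += w / total
    if lo is None and cum >= 0.025:
        lo = mu
    if hi is None and cum >= 0.975:
        hi = mu

print(abs(lo - ci[0]) < 0.01 and abs(hi - ci[1]) < 0.01)  # True: they match
```

With an informative prior or small, skewed data the two intervals drift apart — but for routine problems like this they're practically indistinguishable.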

That said, sometimes one or the other is significantly more convenient or easier to apply for a given model or type of data.


I see, thanks. Are the differences hard to reconcile, given that we can do Monte Carlo on a model and see whose predictions are correct?

I'm just annoyed by there being two camps in science, where one gets slightly different results from the other. It seems to me that one is obviously wrong, since there's only one truth.


You think that's bad, you should check out physics: http://en.wikipedia.org/wiki/Theory_of_everything

Statistics is more about data sets and interpretation than "right" vs. "wrong".


> It seems to me that one is obviously wrong, since there's only one truth.

A bold claim.



The point of Gelman's reply is that the comic is actually comparing a Bayesian to an absurdly incompetent Frequentist, so there's really no conflict. No (modestly intelligent) Frequentist would misapply this methodology in this circumstance.


Sorry, I wasn't talking about the comic. In general, don't the two approaches give different results? Surely, only one is "correct".


Both are correct, but they target different things. The disagreement is about what the target should be, and about the advantages and disadvantages of each choice. Bayesians are interested in p(unknown|data), while frequentists are interested in p(data|unknown = H0). Inference can be framed either way, but the two framings mean different things.
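To see the two targets side by side, here's a pure-Python toy example (my own numbers: 7 heads in 10 flips of a possibly biased coin):

```python
from math import comb, gamma

heads, n = 7, 10  # toy data: 7 heads in 10 flips

# Frequentist target: p(data | H0), the chance of data at least this
# extreme if the coin is fair.
p_value = sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n

# Bayesian target: p(unknown | data), here the probability that the coin
# is heads-biased, under a uniform Beta(1, 1) prior on the bias.
a, b = heads + 1, (n - heads) + 1          # posterior is Beta(8, 4)
norm = gamma(a + b) / (gamma(a) * gamma(b))

def posterior(p):
    return norm * p ** (a - 1) * (1 - p) ** (b - 1)

# Integrate the posterior density over (0.5, 1] by the midpoint rule.
steps = 100_000
width = 0.5 / steps
p_biased = sum(posterior(0.5 + (i + 0.5) * width) * width
               for i in range(steps))

print(round(p_value, 3))   # 0.172: chance of >= 7 heads from a fair coin
print(round(p_biased, 3))  # 0.887: chance the coin favors heads, given data
```

Both numbers are correct answers — they just answer different questions about the same ten flips.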


Are there any situations where you want to use a frequentist procedure?

I've concluded that given a perfect, infinite-power MCMC simulator, I would always do a Gelman-style Bayesian analysis (with model falsification and improvement), but in practice, frequentist methods are computationally convenient.
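Not Gelman's full workflow (in practice you'd reach for Stan or similar), but as a sketch of the machinery: a minimal random-walk Metropolis sampler for a coin's bias, with made-up data (7 heads in 10 flips) and a uniform prior, where the analytic posterior mean is 8/12 ≈ 0.667:

```python
import math
import random

random.seed(1)

heads, n = 7, 10  # toy data: 7 heads in 10 coin flips

def log_post(p):
    # Log-posterior for the coin's bias under a uniform prior
    # (up to an additive constant).
    if not 0 < p < 1:
        return float("-inf")
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

# Random-walk Metropolis: propose a nearby point, accept with
# probability min(1, posterior ratio), otherwise stay put.
samples, p = [], 0.5
for _ in range(20000):
    proposal = p + random.gauss(0, 0.1)
    if math.log(random.random()) < log_post(proposal) - log_post(p):
        p = proposal
    samples.append(p)

burned = samples[2000:]  # discard burn-in
posterior_mean = sum(burned) / len(burned)
print(posterior_mean)  # should land close to 8/12 ~ 0.667
```

A real analysis would add convergence diagnostics and posterior predictive checks — the "model falsification and improvement" part — on top of this core loop.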

> Inference can be framed either way but means different things.

A Bayesian posterior P(H|D,M) is the probability that hypothesis H is true given data D and modelling assumptions M.

What does a frequentist p-value mean?


Sure, see my link above (http://stats.stackexchange.com/a/2287/1122). If you want to put an upper bound on the worst-case probability of making a mistake, you use a p-value. If you want the conditional probability of a particular hypothesis given the observation (and a prior belief), you use a posterior probability. Bayesians can also do silly things (see the cookie example with the inept Bayesian robots). In the end, there is no free lunch.


The frequentist p-value is about H0, not (directly) the hypothesis you are testing. More precisely, it's the probability, assuming H0 is true, of seeing data at least as extreme as what you actually observed. (The probability of rejecting H0 even though it's true is the significance level, i.e. the threshold you compare the p-value against.)
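A pure-Python simulation of my own (a toy z-test, nothing from the thread) makes the frequentist guarantee concrete: when H0 is true and you reject whenever the p-value falls below alpha = 0.05, you end up rejecting about 5% of the time.

```python
import math
import random

random.seed(0)

def p_value(data):
    # Two-sided z-test of H0: mu = 0, with sigma = 1 treated as known.
    n = len(data)
    z = sum(data) / n * math.sqrt(n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

alpha, trials, rejections = 0.05, 20000, 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(30)]  # H0 is true here
    if p_value(sample) < alpha:
        rejections += 1

print(rejections / trials)  # hovers around alpha = 0.05
```

That long-run error rate is exactly the kind of "relative frequency" statement frequentist methods are built to make.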


Wow thank you, this is the clearest and most straightforward explanation of the difference between the two camps in this thread.


They are both models, and as such you might consider that neither of them is "correct." But they are both useful, sometimes in different circumstances.

"Essentially, all models are wrong, but some are useful." — George Box


I see, thank you all very much for the clarifications.



