> There are real examples of Bayesian estimators, for concrete and practical problems such as clustering, that give the wrong estimates for parameters with high probability (even as the sample size grows arbitrarily large).
Could you give some specific examples, and/or references? This is new to me, and I would like to read deeper into it.
Thanks for the detail! I took a look at the first paper; the result was new to me.
Back when reversible jump MCMC was in vogue, I experimented with estimating the number of mixture components under a basic prior (an approach that gives decent results in the paper's Figs. 1 and 3), though I never used a Dirichlet process prior for this problem. The paper points out that even this simple approach is problematic: the estimate is consistent only if the true distribution really is a finite mixture of that form, and in my case it definitely was not.
Anyway, one takeaway, especially from Sec. 1.2.1, is that the Dirichlet process prior is not suitable for estimating the number of components in most cases: it tends to introduce extra, spuriously small clusters, so the posterior on the number of clusters need not concentrate on the true number. And as noted above, the very notion of estimating the number of components is tricky to begin with.
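To make the point concrete, here's a small sketch (my own illustration, not from the paper) using scikit-learn's `BayesianGaussianMixture` with a truncated Dirichlet process prior, fit by variational inference rather than exact posterior sampling. Fitting data that truly comes from a 2-component mixture, you can inspect how the posterior weight spreads over the truncation's components; note that any "effective number of components" you read off depends on an arbitrary weight cutoff, which is part of why the estimand itself is slippery:

```python
# Hypothetical illustration: fit a truncated DP mixture (variational,
# via scikit-learn) to data drawn from a true 2-component mixture, then
# inspect how posterior weight spreads over the available components.
# This is a sketch of how one might probe the "#components" question,
# not a reproduction of the paper's result.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# True data: an equal mixture of N(-2, 1) and N(+2, 1).
x = np.concatenate(
    [rng.normal(-2.0, 1.0, 500), rng.normal(2.0, 1.0, 500)]
).reshape(-1, 1)

dpgmm = BayesianGaussianMixture(
    n_components=10,  # truncation level, deliberately larger than the truth
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(x)

w = np.sort(dpgmm.weights_)[::-1]
print("sorted mixture weights:", np.round(w, 3))

# The "effective" number of components depends entirely on an arbitrary
# cutoff -- a symptom of how ill-posed the question can be:
for cut in (0.05, 0.01, 0.001):
    print(f"components with weight > {cut}: {(w > cut).sum()}")
```

Depending on the data and the concentration prior, the leftover components can carry small but non-negligible weight, which is the flavor of the small-cluster pathology discussed in the paper (there, for the exact posterior rather than a variational approximation).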
Just because you can compute the posterior, doesn’t mean it’s saying what you think it is about the underlying true distribution!