
I can definitely understand wanting to classify Pavlov as supervised learning. I think it's a murky issue because supervised and reinforcement are very closely related (and it is frequently possible to reframe problems of one type as problems of the other depending on what type of model one uses).

My two main reasons for going with this name are as follows (if people see issues in my logic, I'm happy to be convinced):

1) Reinforcement learning gets its name from the behaviorist psychology concept of reinforcement in which an agent's actions are met with rewards in order to shape that agent's future behavior. This is precisely the kind of response conditioning that Pavlov is well known for.

2) The key difference is what the training data look like.

In a supervised learning problem, the training data are input/output pairs (a stimulus and an appropriate action).

In a reinforcement learning problem, the training data are action/reward pairs (an action taken in response to a stimulus, and the reward applied to that action).

I would argue that Pavlov's experiments are more like the latter case - the dogs are not shown 'this is the correct action for this stimulus', they are shown 'this is the reward for this stimulus'.
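A minimal sketch of the distinction, using hypothetical data (the stimuli, actions, and rewards here are illustrative, not anything from the library):

```python
# Supervised learning: (input, correct output) pairs.
# The learner is told the right answer for each stimulus.
supervised_data = [
    ("bell", "salivate"),
    ("light", "ignore"),
]

# Reinforcement learning: (stimulus, action, reward) tuples.
# The learner is only told how good the action it took was,
# never what the "correct" action would have been.
reinforcement_data = [
    ("bell", "salivate", 1.0),
    ("bell", "ignore", 0.0),
]
```

In the second case the learner has to infer the best action from reward signals alone, which is exactly the conditioning setup described above.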



The theory of Markov decision processes is an old, mature, at times deep, and polished field. Names include R. Bellman, E. Dynkin, R. Rockafellar, and D. Bertsekas.

There are connections with scenario aggregation, potentials, linear-quadratic-Gaussian certainty equivalence, currents of sigma algebras, the strong Markov property, stopping times, and much more.

Can we be more clear on just what the Markov processes involved actually are and, then, how they are to be used?


The README links to a blog post (http://nepste.in/jekyll/update/2015/02/22/MDP.html) which details how the library is implemented from the definition of an MDP.

For a more rigorous treatment, Andrew Ng's notes (http://cs229.stanford.edu/notes/cs229-notes12.pdf) are an excellent resource.


Your first reference is good enough -- it's okay.

All I saw was the GitHub page of gibberish -- I don't use GitHub, whatever the heck it is. But your URL was fine.

So, the work is a relatively routine application of classic work from optimization going way back, e.g., to Bellman.
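As a concrete instance of that classic machinery, Bellman's value iteration on a toy two-state MDP can be sketched as follows (this is a generic textbook example, not the library's API; the transition table and discount factor are made up for illustration):

```python
# Value iteration on a toy two-state MDP.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9  # discount factor

# Repeatedly apply the Bellman optimality backup until (near) convergence.
V = {s: 0.0 for s in P}
for _ in range(500):
    V = {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
            for a in P[s]
        )
        for s in P
    }
```

Here the optimal policy is to move to state 1 and stay there collecting reward 2.0 forever, so the values converge to V(1) = 2/(1 - 0.9) = 20 and V(0) = 1 + 0.9 * 20 = 19.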

The "Reinforcement learning" terminology looks like a new label for some quite ancient wine.

I've wondered what machine learning had that was good and new, and so far I've seen some that is good but not new and some that is new but not good.

For an application, it would be good to justify the Markov assumption, that is, that the past and future of the process are conditionally independent given the present.
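One crude way to probe that assumption empirically is to check whether the conditional distribution of the next state changes when you condition on the previous state as well as the current one. A sketch with a simulated chain (the transition table is a made-up example where the Markov property holds by construction):

```python
import random

random.seed(0)

# A toy chain that is Markov by construction:
# T[s] gives P(next = 0 | current = s), P(next = 1 | current = s).
T = {0: [0.9, 0.1], 1: [0.5, 0.5]}

def step(s):
    return 0 if random.random() < T[s][0] else 1

# Simulate a long trajectory.
states = [0]
for _ in range(200_000):
    states.append(step(states[-1]))

def cond_freq(prev):
    # Empirical P(next = 1 | current = 0, previous = prev).
    hits = [states[i + 1] for i in range(1, len(states) - 1)
            if states[i] == 0 and states[i - 1] == prev]
    return sum(hits) / len(hits)

# If the process is Markov, conditioning on the previous state
# should not matter: both frequencies should be near T[0][1] = 0.1.
```

For real data the same comparison would be run on observed trajectories; a large gap between the two conditional frequencies is evidence against the Markov assumption.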

For a more detailed treatment, I'd recommend, say,

E. B. Dynkin and A. A. Yushkevich, 'Controlled Markov Processes'.


Hi - in a previous comment you mention a paper you wrote that describes a distribution-free multivariate anomaly detector (this is the comment: https://qht.co/item?id=9580929).

Would you mind emailing me a copy of it please? Address in profile. Thanks in advance!



