All I saw was the GitHub page of gibberish --
I don't use GitHub, whatever the heck it is.
But your URL was fine.
So, the work is a relatively routine
application of classic work from
optimization going way back, e.g.,
to Bellman.
The "Reinforcement learning"
terminology looks like
a new label for some quite
ancient wine.
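For reference, the classic object presumably being alluded to here is the Bellman optimality equation for a discounted Markov decision process (standard textbook form, not quoted from the comment above):

```latex
V^*(s) = \max_{a} \Big[ r(s,a) + \gamma \sum_{s'} p(s' \mid s, a)\, V^*(s') \Big]
```

Here $V^*$ is the optimal value function, $r(s,a)$ the reward, $\gamma \in [0,1)$ the discount factor, and $p(s' \mid s, a)$ the transition kernel; value iteration and Q-learning are both fixed-point schemes for this equation.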
I've wondered what machine learning
had that was good and new, and so
far I've seen some that is
good but not new and some that is
new but not good.
For an application, it would be
good to justify the Markov
assumption, that is, that
the past and future of the process
are conditionally independent
given the present.
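To make the assumption concrete: for a discrete-time process, the Markov property says P(X_{t+1} | X_t, X_{t-1}, ..., X_0) = P(X_{t+1} | X_t). A minimal sketch of how one might check this empirically, using a hypothetical two-state chain (the transition matrix below is an illustration, not anything from this thread):

```python
import random

random.seed(0)

# Hypothetical 2-state Markov chain: P[i][j] is the probability of
# moving from state i to state j.
P = [[0.7, 0.3],
     [0.4, 0.6]]

def simulate(n):
    """Simulate n steps of the chain starting in state 0."""
    x = 0
    path = [x]
    for _ in range(n):
        x = 0 if random.random() < P[x][0] else 1
        path.append(x)
    return path

path = simulate(200_000)

def cond_prob(prev):
    """Estimate P(X_{t+1}=1 | X_t=0, X_{t-1}=prev) from the sample path."""
    num = den = 0
    for a, b, c in zip(path, path[1:], path[2:]):
        if a == prev and b == 0:
            den += 1
            num += (c == 1)
    return num / den

# Under the Markov property, both estimates should be close to
# P[0][1] = 0.3 regardless of the earlier state.
p0, p1 = cond_prob(0), cond_prob(1)
print(p0, p1)
```

If the data were generated by a non-Markov process, the two conditional estimates would differ; this kind of check is one simple way to probe whether the assumption is defensible for an application.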
For a more detailed treatment, I'd
recommend, say,
E. B. Dynkin and
A. A. Yushkevich,
'Controlled Markov Processes'.
Hi - in a previous comment you mentioned a paper you wrote that describes a distribution-free multivariate anomaly detector (this is the comment: https://qht.co/item?id=9580929).
Would you mind emailing me a copy of it, please? Address in profile. Thanks in advance!
For a more rigorous treatment, Andrew Ng's notes (http://cs229.stanford.edu/notes/cs229-notes12.pdf) are an excellent resource.