mdc29's comments

mdc29 · on July 13, 2013

Some random thoughts (I hope this is useful)

The article and your research are quite interesting. I am currently interested in similar things (more from the theoretical side than the practical application side).

You say at the end of the article that you are interested in “change detection”, which is quite a wide area. What exact setting are you interested in?

Here is a (biased, IMHO) taxonomy of change detection methods for time-series data:

1. Only samples from a time series are provided, with no labels or assumptions.

2. Time series has different “modes”. For example, you have a time series of an electric motor that can either be in normal operation or abnormal operation (broken). You have “training” data that consists only of normal operation (“inliers”). You then have the “test” data, which is a time series of the motor that includes some abnormal (failure) times (the unlabeled test dataset therefore consists of both “inliers” and “outliers”). The goal is then to identify the abnormal times (you do not have labels for the abnormal data, so you cannot straightforwardly train a classifier).

3. Time series has different “modes”. For example, you have a time series of an electric motor in normal operation. You only have “test” data that is a time series of the motor and some abnormal points (e.g. failure). The goal is then to identify the outlier points.

4. The time series has different “modes”, but you don’t have access to labeled sets. A good example is human activity recognition from accelerometer data (e.g. “walking”, “running”, “dancing”, and “sitting”). This is more akin to clustering (i.e. divide the time series data into four clusters).

For (1), I don’t think one-class SVM is the best approach (there are some other methods based on divergence estimation). One-class SVM will work well for (3). There are already some techniques for (2) (I have some ideas to improve them). (4) is an interesting problem, because it is actually an “easier” problem than clustering.

I assume that you have a labeled dataset. Then perhaps (2) is the best setting for determining change points. All of the training data can then be considered to be one class (“inliers”). The points at which the class changes are the ones you don’t know (i.e. the “outliers”).

mdc29 · on July 13, 2013

EDIT: Sorry for the wall of text. Also, why your idea is good:

Consider a person going from "sitting" to "jogging". When the change point occurs, the reading on the accelerometer will be high for a short period (in the forward direction). As soon as he runs at his steady pace, it will be "0".

rvlasveld · on July 15, 2013

Thank you for your nice reply; currently I don't have time to look at your suggestions in depth, but I can give a (short) notion of my interpretation of 'change detection' in this context.

I will consider a signal from accelerometers in a smartphone. There are already many algorithms to determine the activity a user is currently performing (such as walking, jogging, biking, sitting, etc). Often these methods take windows of time (1-2 seconds) and compare characteristics with learned models. This way the time series data is clustered/classified into models. An additional result of this is an (implicit) segmentation of the time series.

My theory is that when the segmentation of the time series is first performed explicitly, the (bigger) chunks of data can be better characterized. So my goal is to find the change points in the time series; a change point means that the underlying process (being the user carrying the phone) has changed (e.g. from walking to running).
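
To make the window-based approach concrete, here is a rough sketch of it, not any particular published algorithm. It assumes a one-dimensional signal, uses the mean and standard deviation of each window as the "characteristics", and uses a nearest-centroid lookup as the "learned model". The names `window_features`, `classify_windows`, `implicit_boundaries` and the centroid values are made up for illustration.

```python
from statistics import mean, pstdev


def window_features(window):
    # Assumed features: mean and (population) standard deviation of the window.
    return (mean(window), pstdev(window))


def classify_windows(signal, centroids, size):
    # Label each non-overlapping window with the nearest centroid.
    # `centroids` maps an activity name to its (mean, std) feature pair.
    labels = []
    for start in range(0, len(signal) - size + 1, size):
        feats = window_features(signal[start:start + size])
        nearest = min(
            centroids,
            key=lambda name: sum((a - b) ** 2 for a, b in zip(feats, centroids[name])),
        )
        labels.append(nearest)
    return labels


def implicit_boundaries(labels):
    # The segmentation falls out as a side effect: a boundary is any
    # window index whose label differs from the previous window's label.
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical centroids and a toy signal, purely for illustration.
centroids = {"sitting": (0.0, 0.05), "walking": (1.0, 0.5)}
signal = [0.0, 0.1] * 10 + [0.5, 1.5] * 10
labels = classify_windows(signal, centroids, size=10)
print(labels, implicit_boundaries(labels))
```

Note that the boundaries can only fall on window edges here, which is one reason an explicit segmentation step might do better.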

I want to detect the change point using novelty-detection techniques; so roughly speaking: when a certain number of data points are considered "new"/"outlier", I assume there is a change in the signal and report that moment of time. Then the model adapts to the new distribution of data and watches again for a change point.
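
Roughly, the loop I have in mind looks like the sketch below. To keep it self-contained, the novelty test is a plain z-score against the mean and standard deviation of the current segment, which is only a stand-in for a proper novelty detector such as a one-class SVM. The function name `detect_changes` and all parameter values are placeholders.

```python
from collections import deque
from statistics import mean, pstdev


def detect_changes(signal, fit_size=20, window=10, min_novel=8, z=3.0):
    """Return the indices at which a change is reported."""
    changes = []
    model = None                    # (mean, std) of the current segment
    buffer = []                     # points collected to (re)fit the model
    recent = deque(maxlen=window)   # novelty flags of the latest points

    for i, x in enumerate(signal):
        if model is None:
            # Fitting phase: collect points until the model can be estimated.
            buffer.append(x)
            if len(buffer) == fit_size:
                model = (mean(buffer), pstdev(buffer) or 1e-9)
                buffer = []
            continue

        mu, sigma = model
        # Stand-in novelty test: a point is "new" if it is far from the mean.
        recent.append(abs(x - mu) > z * sigma)

        if sum(recent) >= min_novel:
            # Enough recent points are novel: report the change and
            # refit the model on the data that follows.
            changes.append(i)
            model = None
            recent.clear()

    return changes
```

The reported index is the moment of detection, so it lags the actual change by up to `min_novel` points; the points used for refitting are also blind to further changes.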

noelwelsh · on July 13, 2013

Interesting stuff. Do you have some cites for 1 and 3?

mdc29 · on July 13, 2013

For (3), classification can be performed if you have an unlabeled class from both distributions and one labeled class.

Here is an example:

http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05460897

(1) usually involves estimating a divergence (such as Kullback-Leibler) with a sliding window and comparing the result.
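
A minimal sketch of that idea, assuming a one-dimensional signal: compare the window just before each time step with the window just after it. The divergence here is estimated from smoothed histograms, which is a crude substitute for the divergence estimators used in the literature. The function names, window size and bin count are arbitrary choices for illustration.

```python
import math


def histogram(values, lo, hi, bins, eps=1e-6):
    # Normalized histogram with a small pseudo-count so no bin is zero.
    counts = [eps] * bins
    width = (hi - lo) / bins or 1.0
    for v in values:
        k = min(int((v - lo) / width), bins - 1)
        counts[k] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same bins.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def sliding_kl(signal, window=20, bins=8):
    # Score each time step t by KL(past window || future window).
    lo, hi = min(signal), max(signal)
    scores = []
    for t in range(window, len(signal) - window + 1):
        past = histogram(signal[t - window:t], lo, hi, bins)
        future = histogram(signal[t:t + window], lo, hi, bins)
        scores.append((t, kl_divergence(past, future)))
    return scores


# Toy signal with a level shift at index 50.
signal = [0.0, 0.1] * 25 + [5.0, 5.1] * 25
t_best, score = max(sliding_kl(signal), key=lambda s: s[1])
print(t_best, score)
```

A change would then be reported wherever the score exceeds some threshold. KL is not symmetric, so swapping the two windows or using a symmetrized variant gives different scores.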