
Would you like to implement unsupervised Salient Subsequence Learning (USSL) in tslearn?

Open wzpy opened this issue 5 years ago • 7 comments

Qin published an article titled "Salient subsequence learning for time series clustering" in IEEE Transactions on Pattern Analysis and Machine Intelligence in 2018. The paper proposes an unsupervised Salient Subsequence Learning (USSL) model that discovers shapelets without requiring labels. The method has clear advantages over the KDBA, KSC and u-shapelet methods, but the authors did not publish the code on GitHub as described in the article. Would you like to implement it in tslearn?

wzpy avatar Dec 01 '19 01:12 wzpy

Hi @wzpy

Thanks for your feedback. This would indeed make sense. We should add it to our TODO list. However, I cannot tell when it can be added to tslearn since we have several new feature requests open at the moment.

rtavenar avatar Dec 02 '19 13:12 rtavenar

@wzpy In the article they say that MATLAB code is published, but I cannot find it anywhere. In general I'd be interested in the topic. Did you find it?

One detail that I can't figure out and is not very well described is how they added the "pseudo-labels" into the data. Do you have any information about that?

RychenerLorenz avatar Feb 26 '20 12:02 RychenerLorenz

I haven't read the paper, but pseudo-labeling is often (at least on Kaggle) the term for when you are making predictions on unlabeled data and then retraining the model with these additional (noisy) labels.
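As a rough illustration of that loop, here is a toy sketch in plain Python. It is not the USSL procedure (the paper may define pseudo-labels differently); the nearest-centroid "model", the helper names `fit_centroids`, `predict` and `pseudo_label_fit`, and the data are all made up for the example.

```python
# Illustrative pseudo-labeling (self-training) loop; NOT the USSL method.
from statistics import mean


def fit_centroids(points, labels):
    """Toy 'model': one centroid per label, for 1-D points."""
    return {c: mean(p for p, l in zip(points, labels) if l == c)
            for c in set(labels)}


def predict(centroids, points):
    """Assign each point the label of its nearest centroid."""
    return [min(centroids, key=lambda c: abs(p - centroids[c])) for p in points]


def pseudo_label_fit(x_labeled, y_labeled, x_unlabeled, n_rounds=3):
    centroids = fit_centroids(x_labeled, y_labeled)
    for _ in range(n_rounds):
        # 1. Predict (possibly noisy) labels for the unlabeled data.
        pseudo = predict(centroids, x_unlabeled)
        # 2. Retrain on the labeled data plus the pseudo-labeled data.
        centroids = fit_centroids(x_labeled + x_unlabeled, y_labeled + pseudo)
    return centroids


# Hypothetical data: two labeled points and four unlabeled ones.
x_labeled, y_labeled = [0.0, 10.0], [0, 1]
x_unlabeled = [1.0, 2.0, 8.0, 9.0]
model = pseudo_label_fit(x_labeled, y_labeled, x_unlabeled)
```

In practice the pseudo-labels are usually filtered by prediction confidence before retraining; that step is left out here to keep the sketch short.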

GillesVandewiele avatar Feb 26 '20 12:02 GillesVandewiele

Alright, thanks!

Let me read up on this and come back to you on an implementation.

RychenerLorenz avatar Feb 26 '20 12:02 RychenerLorenz

@wzpy and @RychenerLorenz: here is the GitHub link: https://github.com/JiaWu-Repository/USLM. I should warn you, though, that I tested it and it does not reproduce the paper's results. From a discussion with people who reproduced the results (not the main authors, who did not answer my email), the hyperparameters are tuned by grid search for each dataset on the train set. I'm not sure we can call this unsupervised learning.
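To make the concern concrete, here is a toy sketch of hyperparameter selection that scores each candidate against ground-truth labels. It is not the USSL code or its actual grid; the clusterer `gap_clustering`, its `gap` parameter and the data are hypothetical, and I am assuming the selection score used labels (the thread only says the search was run on the train set).

```python
# Toy illustration of label-based hyperparameter selection; NOT the USSL code.
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


def gap_clustering(points, gap):
    """Hypothetical clusterer: walk the sorted 1-D points and start a new
    cluster whenever the jump to the next point exceeds `gap`."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels, current = [0] * len(points), 0
    for prev, nxt in zip(order, order[1:]):
        if points[nxt] - points[prev] > gap:
            current += 1
        labels[nxt] = current
    return labels


def grid_search(points, grid, score):
    """Return the value in `grid` whose clustering gets the highest score."""
    return max(grid, key=lambda gap: score(gap_clustering(points, gap)))


points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
true_labels = [0, 0, 0, 1, 1, 1]
grid = [0.5, 5.0, 20.0]

# The score below needs `true_labels`, so the selected model has effectively
# seen supervision even though the clusterer itself never uses labels.
best_gap = grid_search(points, grid, lambda pred: rand_index(pred, true_labels))
```

A fully unsupervised protocol would either fix the hyperparameters across datasets or select them with a label-free criterion.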

blafabregue avatar Mar 17 '21 11:03 blafabregue

Hi @blafabregue

Thanks for your hints. Have you been able to reproduce the published results for fixed values of the hyper-parameters?

rtavenar avatar May 17 '21 09:05 rtavenar

Hi @rtavenar. No, I was not able to do so, as they couldn't provide the hyper-parameters to me. The online code seems to be incomplete, as it does not include the grid-search part. Below I quote the answer I received from the authors of the article "Self-Supervised Time Series Clustering With Model-Based Dynamics" (they reproduced the USSL results on the 85 UCR datasets, available in their supplementary material):

For the hyperparameters of USSL: According to the suggestions of USSL's original paper, we use grid search to tune the hyperparameters, and the search range of hyperparameters refers to the original paper of USSL. If what you want is a list of specific parameter values for each data set, I am very sorry that I can't provide it to you, because I did not record it during the experiment. This is done automatically by the code.

However, I received no answer from the USSL paper's authors about the full code, and I didn't have the time to record the values myself.

blafabregue avatar May 20 '21 08:05 blafabregue