
Is tslearn fully parallelizable for large-scale time series clustering?

Open PercyLau opened this issue 4 years ago • 0 comments

It is a pleasure to see a parallelizable DTW metric, cdist_dtw, in tslearn. However, when the time series data set is very large (e.g., 20 GB) and tslearn runs on a 50-core server, neither DTW nor soft-DTW fully utilizes all 50 cores. This seems to be related to the implementation of TimeSeriesKMeans, cdist_dtw, and cdist_soft_dtw. I am not sure about this, so a discussion would be helpful.
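To make the question concrete, here is a minimal sketch of how a pairwise DTW matrix can be split into independent chunks of work. It is not tslearn's implementation: `dtw`, `pairwise_dtw`, and the `chunk_size` parameter are hypothetical names used only for illustration, and the code uses only the Python standard library. Threads are used to keep the sketch self-contained; pure-Python loops hold the GIL, so a real speedup would need a process pool or a compiled DTW kernel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import math


def dtw(a, b):
    """Plain O(len(a) * len(b)) DTW distance between two 1-D sequences."""
    inf = math.inf
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            # extend the cheapest of the three admissible predecessor cells
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[-1])


def pairwise_dtw(series, n_jobs=4, chunk_size=64):
    """Symmetric DTW distance matrix, computed chunk by chunk (illustrative only)."""
    n = len(series)
    # Only the upper triangle is needed because DTW is symmetric.
    pairs = list(combinations(range(n), 2))
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]

    def work(chunk):
        return [(i, j, dtw(series[i], series[j])) for i, j in chunk]

    dist = [[0.0] * n for _ in range(n)]
    # Hypothetical choice: a thread pool keeps the sketch simple; it shows the
    # task layout, not the speedup a process pool or compiled kernel would give.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        for result in pool.map(work, chunks):
            for i, j, d in result:
                dist[i][j] = dist[j][i] = d
    return dist
```

In a layout like this, all workers stay busy only while there are many more chunks than workers. My guess, which I have not verified against the tslearn source, is that something similar may limit core utilization when the number of independent tasks per step is small.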

Best regards, Percy

PercyLau, Sep 24 '20 16:09