
Feature Importance/Influence in Multivariate Time Series Clustering

Open ajanadj opened this issue 3 years ago • 1 comments

Is there a way to determine how important each feature of a multivariate time series is to the clustering result? For example, that feature x has the most influence on cluster y.

My time series is modeled as (n_ts, ts_length, n_dim) with n_dim as the number of features.

ajanadj avatar Jul 08 '22 14:07 ajanadj

@ajanadj Hi, I am interested in what you mentioned. Do you know how this can be done for simple tabular data with n samples and p features? I am trying to simplify your problem. Could you please provide a reference? (If you do not have a reference or cannot find one, please continue.)


By the way, this is what I think: say we have 10 samples with only 2 features (X and Y), and two clusters. One cluster has centroid (1, 0) and the other has centroid (1, 10), and in both cases the members are very close to their centroid. (Feel free to plot it on a 2D XY-plane.) Which feature is more important? The two centroids have the same X value and differ only in Y, so I think Y is the one that forms the clusters! So, one simple idea is to perform clustering on each dimension separately and see which one gives the highest silhouette score.

(I haven't looked into this, so it may be wrong. It is just an idea!)
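
The per-dimension idea above can be sketched roughly as follows, for data shaped `(n_ts, ts_length, n_dim)` as in the question. This is only an illustration under assumptions: it uses a small hand-written k-means with Euclidean distance in NumPy rather than tslearn's own estimators, the helper names (`kmeans`, `silhouette`, `per_dimension_silhouette`) are made up for this example, and the toy data is synthetic. With tslearn, one would presumably swap in `TimeSeriesKMeans` and `tslearn.clustering.silhouette_score` with the metric of choice.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of X, shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every sample to every center, then nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster became empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def silhouette(X, labels):
    """Mean silhouette coefficient with Euclidean distance."""
    uniq = np.unique(labels)
    if len(uniq) < 2:
        return 0.0
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = dist[i, same].sum() / (n_same - 1)  # mean intra-cluster distance
        b = min(dist[i, labels == c].mean() for c in uniq if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))


def per_dimension_silhouette(X, n_clusters, seed=0):
    """Cluster each feature on its own and return one silhouette per feature.

    X has shape (n_ts, ts_length, n_dim).
    """
    scores = []
    for d in range(X.shape[2]):
        Xd = X[:, :, d]  # univariate series for feature d: (n_ts, ts_length)
        labels = kmeans(Xd, n_clusters, seed=seed)
        scores.append(silhouette(Xd, labels))
    return np.array(scores)


# Toy data mirroring the example: feature 0 is ~1 for every series,
# feature 1 is ~0 for the first half and ~10 for the second half.
rng = np.random.default_rng(42)
n_ts, ts_length = 20, 30
X = np.empty((n_ts, ts_length, 2))
X[:, :, 0] = 1.0 + 0.1 * rng.standard_normal((n_ts, ts_length))
X[:10, :, 1] = 0.0 + 0.1 * rng.standard_normal((10, ts_length))
X[10:, :, 1] = 10.0 + 0.1 * rng.standard_normal((10, ts_length))

scores = per_dimension_silhouette(X, n_clusters=2)
best_feature = int(scores.argmax())
print("silhouette per feature:", scores)
print("most cluster-forming feature:", best_feature)
```

Note that this ranks features by how well each one clusters in isolation, so it can miss features that only matter in combination with others. Comparing the per-feature labels with the labels from the full multivariate clustering would be a natural follow-up check.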

NimaSarajpoor avatar Jul 17 '22 05:07 NimaSarajpoor