
[ENH] Implement Merit Score Function channel selection algorithm

Open TonyBagnall opened this issue 9 months ago • 0 comments

Describe the feature or idea you want to propose

Merit score function algorithm described in "A Feature Selection Method for Multi-dimension Time-Series Data"

https://link.springer.com/chapter/10.1007/978-3-030-65742-0_15

A method based around one nearest neighbour classification with dynamic time warping (1-NN DTW) is described in \cite{kathirgamanathan20mtsc}. A merit score function (MSTS) is used to assess the quality of a subset of dimensions. The DTW distances between cases are precalculated for each dimension. A prediction for each dimension is found through a three-fold cross validation of 1-NN DTW. Similarity is estimated using the adjusted mutual information (AMI), both between the predictions of pairs of dimensions (dimension-to-dimension) and between the predictions of each dimension and the class (dimension-to-class). The MSTS for any subset of dimensions is a function of the average dimension-to-dimension and dimension-to-class AMI. A subset of dimensions is chosen either by enumerating MSTS for all $2^d$ dimension combinations, or by using a wrapper on the top 5% of subsets.

The algorithm first calculates the dimension-to-class (DC) correlation for each dimension, which is the accuracy of the predictions $\hat{y}$ on the train data under three-fold cross validation. Second, the dimension-to-dimension (DD) correlation is calculated as the AMI between the predictions of each pair of dimensions. Finally, for each possible subset, the merit score function is calculated as follows:

$MS(\text{subset}) = \frac{k \overline{DC}}{\sqrt{k+k(k-1)\overline{DD}}}$

where $k$ is the number of dimensions in the subset, $\overline{DC}$ is the average dimension-to-class value over the dimensions in the subset, and $\overline{DD}$ is the average dimension-to-dimension value over all pairs of dimensions in the subset. Evaluating every combination of dimensions makes MSTS infeasible for very high dimensional problems.

MSTS has recently been applied to sensor data and used in conjunction with ROCKET; see "Feature Subset Selection for Detecting Fatigue in Runners using Time Series Sensor Data", https://dl.acm.org/doi/10.1007/978-3-031-09037-0_44
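To make the scoring step concrete, here is a minimal sketch of the merit score and the exhaustive subset search, assuming the DC values and the DD matrix have already been computed. The function names `merit_score` and `best_subset` and the toy numbers are made up for illustration; they are not part of aeon or of the paper's code. For a single dimension there are no pairs, so the sketch assumes $\overline{DD} = 0$, which reduces the score to that dimension's DC.

```python
from itertools import combinations
from math import sqrt


def merit_score(subset, dc, dd):
    """Merit score k*mean(DC) / sqrt(k + k*(k-1)*mean(DD)) for one subset.

    subset: tuple of dimension indices
    dc: list of dimension-to-class values, one per dimension
    dd: square matrix (list of lists) of dimension-to-dimension values
    """
    k = len(subset)
    mean_dc = sum(dc[i] for i in subset) / k
    pairs = list(combinations(subset, 2))
    # Assumption: with a single dimension there are no pairs, so mean DD is 0.
    mean_dd = sum(dd[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * mean_dc / sqrt(k + k * (k - 1) * mean_dd)


def best_subset(dc, dd):
    """Score every non-empty subset (2^d - 1 of them) and return the best."""
    d = len(dc)
    best, best_score = None, float("-inf")
    for k in range(1, d + 1):
        for subset in combinations(range(d), k):
            score = merit_score(subset, dc, dd)
            if score > best_score:
                best, best_score = subset, score
    return best, best_score


# Toy values (hypothetical): dimensions 0 and 1 are both accurate but highly
# redundant, while dimension 2 is weaker but nearly independent of dimension 0.
dc = [0.8, 0.7, 0.4]
dd = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.8],
    [0.1, 0.8, 1.0],
]
subset, score = best_subset(dc, dd)
print(subset, round(score, 4))  # selects (0, 2) for this toy data
```

The loop makes the cost visible: the number of scored subsets doubles with every added dimension, which is why a wrapper over a shortlist of subsets is the alternative for larger $d$.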

Describe your proposed solution

Implement as a BaseCollectionTransformer in the channel_selection package

TonyBagnall · Apr 27 '24 13:04