aeon
[BUG] _get_shp_importance ranking features for linear & trees
Describe the bug
There are a few levels to this issue. The first and most straightforward concerns STC with linear classifiers: the line that extracts the linear classifier's coefficients does the inverse of what is intended. A positive coef means that as a distance value increases, the sample is more likely to belong to the latter class, so the features important to that latter class are those with a small distance and hence a negative coef. Line 695 should be changed to coefs = np.append(coefs, -coefs, axis=0).
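To illustrate the sign argument, here is a minimal sketch on synthetic data. It assumes two hypothetical distance features (one per class) and uses a plain least-squares fit from NumPy as a stand-in for the linear classifier; none of the variable names come from the aeon code base.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n // 2)

# Hypothetical shapelet-distance features (assumption, not aeon output):
# feature 0 is the distance to a class-1 shapelet (small for class 1),
# feature 1 is the distance to a class-0 shapelet (small for class 0).
d_class1_shp = np.where(y == 1, 0.2, 1.0) + rng.normal(0, 0.05, n)
d_class0_shp = np.where(y == 0, 0.2, 1.0) + rng.normal(0, 0.05, n)
X = np.column_stack([d_class1_shp, d_class0_shp])

# Least-squares linear fit to targets -1/+1, standing in for a linear classifier.
A = np.column_stack([X, np.ones(n)])
w, *_ = np.linalg.lstsq(A, 2.0 * y - 1.0, rcond=None)
coefs = w[:2].reshape(1, -1)  # shape (1, n_features), like a binary coef_

# Proposed ordering: row 0 scores features for class 0, row 1 for class 1.
per_class = np.append(coefs, -coefs, axis=0)
best_for_class0 = int(np.argmax(per_class[0]))
best_for_class1 = int(np.argmax(per_class[1]))
print(coefs, best_for_class0, best_for_class1)
```

In this toy setup the class-1 shapelet distance gets a negative coefficient, so with the proposed ordering the top-ranked feature for each class is the shapelet that belongs to that class.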
The next challenge is that for RDST only 1/3 of the features are distances, so the above step won't be a simple fix there. For example, @baraline mentioned that the number of occurrences can be good between 3 and 4 but bad after (and before), so the coefficient of a linear model doesn't capture the real importance of such a feature.
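The occurrence example can be made concrete with a small synthetic sketch. The data below are invented for illustration (a count feature that is only "good" at 3 or 4); the point is that a feature can separate the classes perfectly while having no linear relationship with the label at all.

```python
import numpy as np

# Hypothetical occurrence counts 0..7, each value equally frequent (assumption).
occ = np.tile(np.arange(8), 100).astype(float)
# The positive class is exactly the samples with 3 or 4 occurrences.
y = ((occ >= 3) & (occ <= 4)).astype(int)

# Linear association between the feature and the label.
corr = np.corrcoef(occ, y)[0, 1]

# A simple interval rule on the same feature classifies every sample correctly.
interval_pred = ((occ >= 3) & (occ <= 4)).astype(int)
interval_acc = float(np.mean(interval_pred == y))
print(corr, interval_acc)
```

Because the "good" range sits in the middle of the feature's values, the correlation is essentially zero, and a linear coefficient would rank this feature as unimportant even though it is fully informative.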
As a long-term goal, it might be worth adding a method (independent of the model used) to compute feature importance given a fitted model.
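One possible shape for such a method is permutation importance, which only needs a prediction function and labelled data. The sketch below is an illustration under assumptions, not a proposal for the actual aeon API: `permutation_importance_sketch`, `toy_predict`, and the toy data are all hypothetical names introduced here.

```python
import numpy as np


def permutation_importance_sketch(predict, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each column of X is shuffled (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle one feature to break its link with the label.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances


# Toy data: column 0 is an occurrence count that is "good" at 3 or 4,
# column 1 is pure noise. Both are invented for this example.
rng = np.random.default_rng(1)
occ = np.tile(np.arange(8), 50).astype(float)
X_toy = np.column_stack([occ, rng.normal(size=occ.size)])
y_toy = ((occ >= 3) & (occ <= 4)).astype(int)


def toy_predict(X):
    # Stand-in for a fitted model's predict; it only uses column 0.
    return ((X[:, 0] >= 3) & (X[:, 0] <= 4)).astype(int)


toy_importances = permutation_importance_sketch(toy_predict, X_toy, y_toy)
print(toy_importances)
```

Unlike a linear coefficient, this score picks up the non-monotonic occurrence feature and gives the noise column zero importance. scikit-learn ships a comparable utility, `sklearn.inspection.permutation_importance`, which could be an alternative to a hand-rolled version.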
Steps/Code to reproduce the bug
On the GunPoint problem, _get_shp_importance(0)[0] returns shapelets from the no gun class (encoded as 1); it should return the gun shapelets (encoded as 0).
Expected results
NA
Actual results
NA
Versions
No response