
[BUG] _get_shp_importance ranking features for linear & trees

Open IRKnyazev opened this issue 5 months ago • 3 comments

Describe the bug

There are a few levels to this issue. The first and most straightforward concerns STC with linear classifiers: the line that extracts the linear classifier's coefficients does the inverse of what is intended. A positive coef means that as a distance value increases, the sample is more likely to belong to the latter class, so the features important to that latter class are those with a small distance and hence a negative coef. Line 695 should be changed to coefs = np.append(coefs, -coefs, axis=0).
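To make the sign argument concrete, here is a minimal sketch using made-up coefficients (the values and variable names other than coefs are illustrative, not taken from the aeon source). It assumes a binary linear classifier whose single coefficient row points towards the latter class (class 1), and that every feature is a shapelet distance:

```python
import numpy as np

# Hypothetical coefficients of a fitted binary linear classifier,
# shape (1, n_shapelets). Each feature is a distance to one shapelet.
coefs = np.array([[1.5, -2.0, 0.3]])

# Proposed fix: row 0 scores shapelets for class 0, row 1 for class 1.
# A positive coef means a large distance favours class 1, so a small
# distance (shapelet present) favours class 0.
per_class = np.append(coefs, -coefs, axis=0)

top_class0 = int(np.argmax(per_class[0]))  # shapelet 0 (coef +1.5)
top_class1 = int(np.argmax(per_class[1]))  # shapelet 1 (coef -2.0)
print(per_class.shape, top_class0, top_class1)
```

With this ordering, the shapelet with the largest positive coef ranks first for class 0 and the one with the most negative coef ranks first for class 1.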

The next challenge is that for RTSD only 1/3 of the features are distance metrics, so the above step won't be a simple fix here. For example, @baraline mentioned that the number of occurrences can be good between 3 and 4 but bad after (and before), so the coefficient in a linear model doesn't capture the real importance here.
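A toy illustration of why a single linear coefficient misses this kind of feature. The data below is invented, and an ordinary least-squares line stands in for the linear classifier, so treat it as a sketch of the idea rather than what RTSD actually produces:

```python
import numpy as np

# Hypothetical occurrence counts 0..7; the label is 1 only when the
# count is 3 or 4 ("good between 3 and 4, bad before and after").
x = np.arange(8.0)
y = ((x >= 3) & (x <= 4)).astype(float)

# Least-squares line through the data. The relationship is symmetric
# around 3.5, so the fitted slope is (numerically) zero even though
# the feature fully determines the label.
slope, intercept = np.polyfit(x, y, 1)
print(round(abs(slope), 6), round(intercept, 6))
```

The feature is perfectly informative here, yet ranking by coefficient magnitude would place it last.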

As a long-term goal, it might be worth adding a method (independent of the model used) to compute feature importance given a fitted model.
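One model-agnostic option is permutation importance: shuffle one feature column at a time and measure how much accuracy drops. The sketch below is a plain NumPy version under that assumption. The function name, its signature, and the toy predict function are hypothetical and not part of aeon:

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled.

    `predict` is any callable mapping an (n, d) array to n labels,
    so this works for linear models, trees, or anything else.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances

# Toy data: column 0 is an occurrence count, column 1 is unused noise.
X = np.column_stack([np.arange(8.0), [5, 1, 4, 2, 8, 3, 7, 6]])
y = ((X[:, 0] >= 3) & (X[:, 0] <= 4)).astype(int)

def predict(X):
    # Stand-in for a fitted model that learned the non-monotonic rule.
    return ((X[:, 0] >= 3) & (X[:, 0] <= 4)).astype(int)

imp = permutation_importance(predict, X, y)
print(imp)
```

Because it only needs predictions, this approach handles non-monotonic features that a coefficient-based ranking would miss, at the cost of extra prediction passes.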

Steps/Code to reproduce the bug

In the case of the GunPoint problem, _get_shp_importance(0)[0] returns shapelets from the no gun class (encoded as 1); it should return the gun shapelets (encoded as 0).

Expected results

NA

Actual results

NA

Versions

No response

IRKnyazev, Aug 27 '24 15:08