tsflex

[Feature Request] Integration with pyts

Open • GillesVandewiele opened this issue 2 years ago • 1 comment

pyts contains some cool feature extractors, such as WEASEL and BOSS, that are unique to that package. A wrapper around pyts would be a nice addition!

Relevant papers:

  1. The BOSS is concerned with time series classification in the presence of noise
  2. Fast and Accurate Time Series Classification with WEASEL

GillesVandewiele, Mar 16 '22 08:03

The following snippet creates a wrapper function for sklearn.base.TransformerMixin-like classes (anything that exposes fit and transform):

from tsflex.features import FuncWrapper
from tsflex.features.utils import _get_funcwrapper_func_and_kwargs

def transformer_wrapper(transformer) -> FuncWrapper:
    from sklearn.base import TransformerMixin
    # assert isinstance(transformer, FuncWrapper) or isinstance(transformer, TransformerMixin)
    
    func_wrapper_kwargs = {}
    if isinstance(transformer, FuncWrapper):
        # Extract the function and keyword arguments from the function wrapper
        transformer, func_wrapper_kwargs = _get_funcwrapper_func_and_kwargs(transformer)

    assert hasattr(transformer, "fit") and hasattr(transformer, "transform")

    func_wrapper_kwargs["vectorized"] = True
    
    def wrap_transformer(X):
        try:
            res = transformer.transform(X)
            print("Transforming")
            return res
        except Exception:  # e.g. the transformer has not been fitted yet
            print("Fitting + Transforming")
            return transformer.fit_transform(X)

    wrap_transformer.__name__ = "[wrapped_transformer]__" + transformer.__repr__()
    return FuncWrapper(wrap_transformer, **func_wrapper_kwargs)

I did not add this function to the collection of wrappers in tsflex.features.integrations because:

  1. It is ambiguous, and therefore inconvenient, whether the wrapped transformer is fitting + transforming or just transforming on a given call.
  2. If this wrapper is used in a MultipleFeatureDescriptors constructor, it is very hard to make multiple copies of the wrapped transformer, as not every library properly implements __copy__ and __deepcopy__ (see for example pyts.transformation.ROCKET [https://pyts.readthedocs.io/en/stable/generated/pyts.transformation.ROCKET.html]).

The transformer_wrapper snippet is perfectly fine to use in the FeatureDescriptor constructor (which avoids issue 2), as long as you first calculate features on your training data (fitting + transforming) and only afterwards on your test / inference data (just transforming), which avoids issue 1.
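A minimal sketch of why that call order matters, using only the standard library. `MeanCenterer` and `wrap` are hypothetical stand-ins (not part of tsflex or pyts); `wrap` copies the try/except logic of wrap_transformer without the FuncWrapper layer.

```python
class MeanCenterer:
    """Toy fit/transform-style transformer (hypothetical stand-in)."""

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        # Raises AttributeError when called before fit (mean_ does not exist yet)
        return [x - self.mean_ for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)


def wrap(transformer):
    """Same try/except logic as wrap_transformer, without FuncWrapper."""

    def wrapped(X):
        try:
            return transformer.transform(X)
        except Exception:
            return transformer.fit_transform(X)

    return wrapped


f = wrap(MeanCenterer())
train = [1.0, 2.0, 3.0]
test = [4.0, 5.0, 6.0]

train_out = f(train)  # first call: transform fails, so it fits on train (mean 2.0)
test_out = f(test)    # second call: reuses the mean learned on train
print(train_out, test_out)
```

Calling `f(test)` first would silently fit on the test data instead, which is the ambiguity described in issue 1.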

Example of correct usage of transformer_wrapper:

from tsflex.features import FuncWrapper, FeatureDescriptor, FeatureCollection
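from pyts.transformation import ROCKET

# NOTE: `sensors` is assumed to be a list of the series names in your data,
# and transformer_wrapper is the function defined in the snippet above.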

fc_rocket = FeatureCollection(
    feature_descriptors=[
        FeatureDescriptor(
            function=transformer_wrapper(
                FuncWrapper(
                    ROCKET(100), 
                    output_names=[f"rocket_{i}" for i in range(200)]  # each kernel outputs 2 values
                )
            ),
            window=60, stride=60, series_name=sensor
        ) for sensor in sensors
    ]
)
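One possible alternative, sketched under assumptions and not tested against tsflex: track the fitted state explicitly instead of relying on try/except (issue 1), and build one transformer per series from a factory instead of copying a single instance (issue 2). `FitThenTransform`, `one_wrapper_per_series`, the toy `Scaler`, and the series names are all hypothetical.

```python
class FitThenTransform:
    """Callable that fits on its first call and only transforms afterwards."""

    def __init__(self, transformer):
        # Duck-typed check, similar to the assert in transformer_wrapper
        assert hasattr(transformer, "fit_transform") and hasattr(transformer, "transform")
        self.transformer = transformer
        self.is_fitted = False  # explicit state instead of try/except

    def __call__(self, X):
        if self.is_fitted:
            return self.transformer.transform(X)
        out = self.transformer.fit_transform(X)
        self.is_fitted = True
        return out


def one_wrapper_per_series(factory, series_names):
    """Call `factory` once per series, so no __copy__/__deepcopy__ is needed."""
    return {name: FitThenTransform(factory()) for name in series_names}


class Scaler:
    """Toy transformer (hypothetical) that divides by the max seen during fit."""

    def fit(self, X):
        self.max_ = max(X)
        return self

    def transform(self, X):
        return [x / self.max_ for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)


wrappers = one_wrapper_per_series(Scaler, ["acc_x", "acc_y"])
first = wrappers["acc_x"]([1.0, 2.0])  # fits the acc_x transformer only
second = wrappers["acc_x"]([4.0])      # transforms with the max learned above
print(first, second, wrappers["acc_y"].is_fitted)
```

Each callable could then be passed to its own FeatureDescriptor, in the same way as in the ROCKET example. With the explicit flag, a genuine error inside transform is raised instead of silently triggering a refit.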

PS: other suggestions for an implementation are welcome :smile:

Cheers, Jeroen

jvdd, Apr 24 '22 19:04