Cannot add custom feature dynamically with multiprocessing

Open komodovaran opened this issue 3 years ago • 5 comments

tsfresh 0.18.0

This works:

import pandas as pd
import tsfresh
from tsfresh.feature_extraction import feature_calculators
from tsfresh.feature_extraction.feature_calculators import set_property

@set_property("fctype", "simple")
def f(x):
    return 2

def main():
    df = pd.DataFrame({"time": [1,1,1, 1, 1, 1, 1, 1],
                       "id": [1,2,3, 4, 5, 6, 7, 8],
                       "x": [1,1,1, 1, 1, 1, 1, 1]})

    setattr(feature_calculators, "f", f)

    x = tsfresh.extract_features(
        df,
        column_sort="time",
        column_id="id",
        default_fc_parameters={"f": None},
        chunksize=1,
        n_jobs=0,
    )

    print(x)

if __name__ == "__main__":
    main()

But if n_jobs is increased, it fails with

AttributeError: module 'tsfresh.feature_extraction.feature_calculators' has no attribute 'f'

How else would I add the feature without modifying the source directly?

komodovaran avatar Jun 07 '21 15:06 komodovaran

Hi @komodovaran, we have documentation for this here. You should not modify the feature_calculators module directly; instead, we have included custom logic so that you only need to add your function to the settings dictionary (as shown in the linked documentation). If the documentation is not clear enough, please tell us!

If you want to know more about the technical details and the history of the feature, have a look into #482 and #845.

nils-braun avatar Jun 07 '21 15:06 nils-braun

Adding the following, as shown in the documentation and #845, doesn't seem to work:

settings = ComprehensiveFCParameters()
settings[f] = None
TypeError: getattr(): attribute name must be string

Nor does this

settings["f"] = None
AttributeError: module 'tsfresh.feature_extraction.feature_calculators' has no attribute 'f'

komodovaran avatar Jun 08 '21 07:06 komodovaran

Sorry @komodovaran, my bad. The documentation I referred to describes the current head/source version; 0.18.0 does not include this feature. Is using the source version possible for you for now? We will release a new version soon.

nils-braun avatar Jun 08 '21 11:06 nils-braun
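For readers wondering why both attempts fail on 0.18.0: the two error messages are consistent with a lookup that resolves every settings key by name on a module, whereas the newer behaviour described in the documentation accepts the function itself as a key. The following is only a simplified sketch of that difference, not the tsfresh source; `fake_calculators`, `resolve_by_name_only`, `resolve_name_or_callable` and `error_name` are hypothetical names made up for illustration.

```python
from types import SimpleNamespace


def mean(x):
    return sum(x) / len(x)


def f(x):
    """Custom feature from the thread: always returns 2."""
    return 2


# Hypothetical stand-in for a module of built-in calculators; `f` is not in it.
fake_calculators = SimpleNamespace(mean=mean)


def resolve_by_name_only(key):
    """Assumed older behaviour: every settings key is looked up by name."""
    return getattr(fake_calculators, key)


def resolve_name_or_callable(key):
    """Assumed newer behaviour: callables are used as-is, strings are looked up."""
    if callable(key):
        return key
    return getattr(fake_calculators, key)


def error_name(resolver, key):
    """Return the name of the exception raised while resolving key, or None."""
    try:
        resolver(key)
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__
    return None


print(error_name(resolve_by_name_only, f))        # function object as key
print(error_name(resolve_by_name_only, "f"))      # unknown name as key
print(error_name(resolve_name_or_callable, f))    # function object as key
```

With a name-only lookup, a function object as key raises a TypeError and an unregistered name raises an AttributeError, which matches the two messages reported above. A lookup that checks for callables first avoids both.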

Hi, first of all, thank you for building this good application for time-series calculation. Can you please help me add new features to MinimalExtractionParameters? Please reply ASAP.

@set_property("fctype", "simple")
def flor_ratio(x):
    """
    The description of your feature

    :param x: the time series to calculate the feature of
    :type x: pandas.Series
    :return: the value of this feature
    :return type: bool, int or float
    """
    # Calculation of feature as float, int or bool
    u = np.max(x)
    v = np.min(x)
    # flor_ratio = (u-v)/v
    return (u - v) / v

fc_parameters = {
    'median': None,
    'mean': None,
    'standard_deviation': None,
    'root_mean_square': None,
    'maximum': None,
    'minimum': None,
    'flor_ratio': None,
}

from sktime.transformations.panel.tsfresh import TSFreshRelevantFeatureExtractor
from tsfresh.feature_extraction import extract_features

transformer = TSFreshRelevantFeatureExtractor(kind_to_fc_parameters=fc_parameters)
extracted_features = transformer.fit_transform(X_train, y_train)
extracted_features

But this doesn't calculate the features I gave it to calculate.

utkarshtri1997 avatar Jan 20 '22 13:01 utkarshtri1997

For anyone who is struggling, here is an update of @komodovaran's original formulation that works:

import pandas as pd
import tsfresh
from tsfresh.feature_extraction.feature_calculators import set_property
from tsfresh.feature_extraction.settings import PickableSettings


@set_property("fctype", "simple")
def f(x):
    return 2


def main():
    df = pd.DataFrame({"time": [1,1,1, 1, 1, 1, 1, 1],
                       "id": [1,2,3, 4, 5, 6, 7, 8],
                       "x": [1,1,1, 1, 1, 1, 1, 1]})

    settings = PickableSettings()
    settings[f] = None
    x = tsfresh.extract_features(
        df,
        column_sort="time",
        column_id="id",
        default_fc_parameters=settings,
        chunksize=1,
        n_jobs=0,
    )

    print(x)


if __name__ == "__main__":
    main()

jackalack avatar May 10 '22 22:05 jackalack
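Some background on why a settings object that holds functions as keys needs to care about pickling at all: multiprocessing hands work to worker processes by pickling it, and the standard pickle module stores a function as a reference (module plus name) rather than as code. The sketch below uses only the standard library to show that general Python behaviour; it is not tsfresh code, and `can_pickle` and `anonymous_feature` are hypothetical names for illustration.

```python
import pickle
import statistics


def can_pickle(obj):
    """Return True if the standard pickle module can serialise obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


# An importable function is pickled as a reference to "statistics.mean",
# so unpickling returns the very same function object.
payload = pickle.dumps(statistics.mean)
restored = pickle.loads(payload)
print(restored is statistics.mean)

# A function that cannot be found again by name (here: a lambda) cannot be
# pickled by reference at all.
anonymous_feature = lambda x: 2
print(can_pickle(anonymous_feature))
```

A function that a worker process cannot find again under the same module and name is therefore a problem for plain pickle, which is presumably the situation a class named PickableSettings is meant to handle.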

Is it possible to combine a list of my own custom features with a subset of the features that tsfresh provides?

enesok avatar Oct 04 '22 14:10 enesok

@utkarshtri1997 - the documentation has an example of how to add your own feature extractor (https://tsfresh.readthedocs.io/en/latest/text/how_to_add_custom_feature.html) - I guess you followed it? Your code looks reasonable (although it is very hard to read, maybe add a bit of GitHub markdown around it?). Please note, however, that you are using a TSFreshRelevantFeatureExtractor, which only extracts relevant features. Maybe your feature was just not relevant?

Thanks @jackalack for your example! This looks good. For those reading this thread: please note that ComprehensiveFCParameters (the one used by @komodovaran) also derives from PickableSettings. I think the only reason it did not work for @komodovaran was that the newest tsfresh release at the time did not yet include the changes of #845.

@enesok: Sure, you can combine the default calculators of tsfresh with your own ones. You can either start with one of the provided settings (e.g. ComprehensiveFCParameters or similar) and delete the keys you do not want, as described in the documentation, or start from scratch with an empty dictionary:

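from tsfresh import extract_features

# my_function is your own calculator, decorated with @set_property("fctype", "simple")
my_settings = {
    "mean": None,
    "maximum": None,
    my_function: None,
}

extract_features(df, column_id='id', column_sort='time', default_fc_parameters=my_settings)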

nils-braun avatar Feb 19 '23 17:02 nils-braun

For better tracking, I am closing the issue as the original issue was fixed and released. Please open a new one if it still does not work for you!

nils-braun avatar Feb 19 '23 17:02 nils-braun