tsfresh
Cannot add custom feature dynamically with multiprocessing
tsfresh 0.18.0
This works:
import pandas as pd
import tsfresh
from tsfresh.feature_extraction import feature_calculators
from tsfresh.feature_extraction.feature_calculators import set_property

@set_property("fctype", "simple")
def f(x):
    return 2

def main():
    df = pd.DataFrame({"time": [1, 1, 1, 1, 1, 1, 1, 1],
                       "id": [1, 2, 3, 4, 5, 6, 7, 8],
                       "x": [1, 1, 1, 1, 1, 1, 1, 1]})
    setattr(feature_calculators, "f", f)
    x = tsfresh.extract_features(
        df,
        column_sort="time",
        column_id="id",
        default_fc_parameters={"f": None},
        chunksize=1,
        n_jobs=0,
    )
    print(x)

if __name__ == "__main__":
    main()
But if n_jobs is increased, it fails with
AttributeError: module 'tsfresh.feature_extraction.feature_calculators' has no attribute 'f'
How else would I add the feature without modifying the source directly?
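One plausible reading of this error, assuming the worker processes are started as fresh interpreters (for example with the "spawn" start method, which is an assumption about the platform and not something stated in this thread): an attribute patched onto a module at runtime exists only in the parent process. The sketch below uses the standard-library math module and a subprocess as stand-ins for feature_calculators and a tsfresh worker; it is an illustration of the mechanism, not tsfresh's actual code path.

```python
import math
import subprocess
import sys

# Patch an attribute onto an already-imported module, in the same way as
# setattr(feature_calculators, "f", f) in the snippet above.
setattr(math, "f", lambda x: 2)
parent_sees_f = hasattr(math, "f")

# A fresh interpreter imports its own copy of the module and never runs
# the setattr call above, so the attribute is missing there.
child = subprocess.run(
    [sys.executable, "-c", "import math; print(hasattr(math, 'f'))"],
    capture_output=True,
    text=True,
    check=True,
)
child_sees_f = child.stdout.strip() == "True"

print(parent_sees_f, child_sees_f)  # True False
```

With a "fork" start method the child would typically inherit the patched module, so whether the error appears can depend on the operating system and Python version.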
Hi @komodovaran
We have documentation for this here. You should not modify the feature_calculators module directly; we have included custom logic so that you only need to add your function to the settings dictionary (as shown in the linked documentation). If the documentation is not clear enough, please tell us!
If you want to know more about the technical details and the history of the feature, have a look into #482 and #845.
Adding the following, as shown in the documentation and #845, doesn't seem to work:
settings = ComprehensiveFCParameters()
settings[f] = None
TypeError: getattr(): attribute name must be string
Nor does this
settings["f"] = None
AttributeError: module 'tsfresh.feature_extraction.feature_calculators' has no attribute 'f'
Sorry @komodovaran - my bad. The documentation I referred to describes the current head/source version; 0.18.0 does not include this feature. Is using the source version possible for you for now? We will release a new version soon.
Hi, first of all, thank you for building this good application for time-series calculation. Can you please help me add new features to MinimalExtractionParameters? Please reply ASAP.

import numpy as np
from sktime.transformations.panel.tsfresh import TSFreshRelevantFeatureExtractor
from tsfresh.feature_extraction import extract_features
from tsfresh.feature_extraction.feature_calculators import set_property

@set_property("fctype", "simple")
def flor_ratio(x):
    """
    The description of your feature

    :param x: the time series to calculate the feature of
    :type x: pandas.Series
    :return: the value of this feature
    :return type: bool, int or float
    """
    # Calculation of feature as float, int or bool
    u = np.max(x)
    v = np.min(x)
    # flor_ratio = (u - v) / v
    return (u - v) / v

fc_parameters = {
    'median': None,
    'mean': None,
    'standard_deviation': None,
    'root_mean_square': None,
    'maximum': None,
    'minimum': None,
    'flor_ratio': None,
}

transformer = TSFreshRelevantFeatureExtractor(kind_to_fc_parameters=fc_parameters)
extracted_features = transformer.fit_transform(X_train, y_train)
extracted_features

But this doesn't calculate the features I gave it to calculate.
For anyone who is struggling, here is an update of @komodovaran's original formulation that works:
import pandas as pd
import tsfresh
from tsfresh.feature_extraction.feature_calculators import set_property
from tsfresh.feature_extraction.settings import PickableSettings

@set_property("fctype", "simple")
def f(x):
    return 2

def main():
    df = pd.DataFrame({"time": [1, 1, 1, 1, 1, 1, 1, 1],
                       "id": [1, 2, 3, 4, 5, 6, 7, 8],
                       "x": [1, 1, 1, 1, 1, 1, 1, 1]})
    settings = PickableSettings()
    settings[f] = None
    x = tsfresh.extract_features(
        df,
        column_sort="time",
        column_id="id",
        default_fc_parameters=settings,
        chunksize=1,
        n_jobs=0,
    )
    print(x)

if __name__ == "__main__":
    main()
Is it possible to combine a list of my own features with a subset of the features that tsfresh provides?
@utkarshtri1997 - the documentation has an example on how to add your own feature extractor (https://tsfresh.readthedocs.io/en/latest/text/how_to_add_custom_feature.html) - I guess you followed it? Your code looks reasonable (although it is very hard to read, maybe add a bit of GitHub markdown around it?).
Please note, however, that you are using a TSFreshRelevantFeatureExtractor, which only extracts relevant features. Maybe your feature was just not relevant?
Thanks @jackalack for your example! This looks good.
For those reading this thread: please note that ComprehensiveFCParameters (the one used by @komodovaran) also derives from PickableSettings. I think the only reason it did not work for @komodovaran back then was that the newest tsfresh release at the time did not include the changes of #845.
@enesok: Sure, you can combine the default calculators of tsfresh with your own ones. You can either start with one of the provided settings (e.g. ComprehensiveFCParameters or alike) and delete the keys you do not want, as described in the documentation, or start from scratch with an empty dictionary:
my_settings = {
    "mean": None,
    "maximum": None,
    my_function: None,
}
extract_features(df, column_id='id', column_sort='time', default_fc_parameters=my_settings)
For better tracking, I am closing the issue as the original issue was fixed and released. Please open a new one if it still does not work for you!