tsfeatures
Feature calculation is stuck (issue with the multiprocessing lib on Windows)
What happened + What you expected to happen
As your docs mention, it should be possible to add a custom feature (I copy-pasted your function from the README), but nothing happens, even after several long minutes.
Could you please check?
Versions / Dependencies
0.4.2 (the latest release)
Reproduction script
import pandas as pd
import numpy as np
from tsfeatures import tsfeatures

periods = 24
ind = pd.date_range(start='2021-01-01', periods=periods, freq='MS')
vals = np.random.rand(periods)
df = pd.DataFrame({'ds': ind, 'y': vals, 'unique_id': 1})

def number_zeros(x, freq):
    number = (x == 0).sum()
    return {'number_zeros': number}

features_df = tsfeatures(df, freq=12, features=[number_zeros])
features_df
Issue Severity
None
I'm having a similar issue. If I understand correctly, the number_zeros function will count the number of zeros for each unique_id.
def number_zeros(x, freq):
    number = (x == 0).sum()
    return {'number_zeros': number}
features = tsf.tsfeatures(data, features=[tsf.stl_features, number_zeros], dict_freqs={'MS': 12,})
The result is wrong: number_zeros should not be all zeros like this, because some unique_ids in the data do contain zeros.
| | unique_id | number_zeros |
|---|---|---|
| 0 | 282998 | 0 |
| 1 | 347809 | 0 |
| 2 | 489552 | 0 |
| 3 | 594474 | 0 |
| 4 | 594861 | 0 |
| 5 | 595209 | 0 |
| 6 | 595956 | 0 |
| 7 | 600426 | 0 |
| 8 | 600429 | 0 |
Currently I'm having to do this instead:
features = pd.merge(
    data[["unique_id", "y"]].query("y>0").groupby("unique_id").count().reset_index(),
    features,
    how="left",
    on="unique_id",
)
features.rename(columns={"y": "series_length"}, inplace=True)
I think the issue is that the scale argument in tsfeatures is set to True by default, so each series is scaled before the features are computed and the original zeros are no longer equal to zero. Try setting scale=False and rerun.
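To illustrate why scaling would hide the zeros, here is a minimal sketch. It assumes the scaling step standardizes each series to zero mean and unit sample standard deviation; `standardize` and `count_zeros` are hypothetical stand-ins written for this example, not functions from tsfeatures.

```python
from statistics import mean, stdev


def standardize(values):
    """Hypothetical stand-in for the scaling step: zero mean, unit sample std."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


def count_zeros(values):
    """Count the entries that are exactly zero."""
    return sum(1 for v in values if v == 0)


raw = [0.0, 0.0, 3.0, 5.0, 0.0, 4.0]  # made-up series with three zeros
scaled = standardize(raw)

zeros_raw = count_zeros(raw)        # counts zeros in the original series
zeros_scaled = count_zeros(scaled)  # counts zeros after standardization
print(zeros_raw, zeros_scaled)      # prints: 3 0
```

The mean of the series is 2.0, so every original zero becomes a negative value after scaling, and a feature that tests `x == 0` finds nothing.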
Actually, my issue (infinite loop) is coming from multiprocessing. I think tsfeatures cannot be used on Windows from a Jupyter notebook / IPython.
I have used tsfeatures in Jupyter notebooks and did not have any issues.
OK @ngupta23, but what is your OS?
I used it in WSL
Windows Subsystem for Linux is not pure Windows. ;) Multiprocessing works differently on Linux and on Windows.
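For reference, the difference is in how worker processes are started: Linux has traditionally forked the parent, while Windows spawns a fresh interpreter that re-imports the main module. A minimal sketch of the usual pattern for the Windows case follows. It uses only the standard library, and `count_zeros` and `count_zeros_parallel` are hypothetical names for this example. Whether the same pattern resolves the hang reported in this thread has not been confirmed here.

```python
import multiprocessing as mp


def count_zeros(values):
    """Worker function; defined at module level so child processes can import it."""
    return sum(1 for v in values if v == 0)


def count_zeros_parallel(series_list, processes=2):
    """Apply count_zeros to each series in a pool of worker processes."""
    with mp.Pool(processes=processes) as pool:
        return pool.map(count_zeros, series_list)


# The guard keeps spawned children from re-running the pool creation
# when they re-import this file (the default behaviour on Windows).
if __name__ == "__main__":
    series = [[0, 1, 0], [2, 3, 4]]
    print(count_zeros_parallel(series))  # prints: [2, 0]
```

Python's multiprocessing documentation also notes that Pool examples may not work in an interactive interpreter, because the worker function must be importable by the child processes. In a notebook, one option is to move the custom feature function into a separate .py module and import it.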