
Feature calculation gets stuck (issue with the multiprocessing library on Windows)

Open GGA-PERSO opened this issue 2 years ago • 8 comments

What happened + What you expected to happen

As your docs mention, it should be possible to add a custom feature. I copied and pasted the function from the README, but nothing happens, even after several long minutes.

Could you please check?

Versions / Dependencies

0.4.2 (the latest release)

Reproduction script

import pandas as pd
import numpy as np
from tsfeatures import tsfeatures

# 24 monthly observations of random values for a single series
periods = 24
ind = pd.date_range(start='2021-01-01', periods=periods, freq='MS')
vals = np.random.rand(periods)
df = pd.DataFrame({'ds': ind, 'y': vals, 'unique_id': 1})

# Custom feature copied from the README
def number_zeros(x, freq):
    number = (x == 0).sum()
    return {'number_zeros': number}

features_df = tsfeatures(df, freq=12, features=[number_zeros])
features_df

Issue Severity

None

GGA-PERSO avatar Jun 18 '23 12:06 GGA-PERSO

I'm having a similar issue. If I understand correctly, the number_zeros function will count the number of zeros for each unique_id.

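import tsfeatures as tsf

def number_zeros(x, freq):
    # Count the observations that are exactly zero
    number = (x == 0).sum()
    return {'number_zeros': number}

# data is a long-format DataFrame with unique_id, ds and y columns
features = tsf.tsfeatures(data, features=[tsf.stl_features, number_zeros], dict_freqs={'MS': 12})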

The result is wrong: number_zeros should not be 0 for every series, because some unique_ids in the data do contain zeros.

unique_id number_zeros
0 282998 0
1 347809 0
2 489552 0
3 594474 0
4 594861 0
5 595209 0
6 595956 0
7 600426 0
8 600429 0

Currently I'm having to do this instead:

features = pd.merge(
    data[["unique_id", "y"]].query("y>0").groupby("unique_id").count().reset_index(),
    features,
    how="left",
    on="unique_id",
)

features.rename(columns={"y": "series_length"}, inplace=True)

truonghm avatar Aug 28 '23 08:08 truonghm
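For comparison, a per-series zero count can also be computed with pandas alone, without going through tsfeatures. The sketch below is only an illustration: the `data` frame is made up, and only the column names `unique_id` and `y` are taken from the comment above.

```python
import pandas as pd

# Made-up example data in the long format used in this thread
data = pd.DataFrame({
    "unique_id": [1, 1, 1, 2, 2, 2],
    "y": [0.0, 2.0, 0.0, 1.0, 3.0, 4.0],
})

# Flag the zero observations, then sum the flags within each series
zero_counts = (
    data["y"].eq(0)
    .groupby(data["unique_id"])
    .sum()
    .rename("number_zeros")
    .reset_index()
)
print(zero_counts)
```

The resulting frame has one row per `unique_id` and can be merged onto the tsfeatures output with `pd.merge(..., on="unique_id", how="left")`, as in the workaround above.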

I think the issue is that the scale argument of tsfeatures is set to True by default, so each series is scaled before the features are computed and the original zeros are no longer equal to zero. Try changing it to False and rerunning.

ngupta23 avatar Nov 19 '24 20:11 ngupta23
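A minimal sketch of why scaling could hide the zeros, using NumPy only. The `standardize` helper is a hypothetical stand-in for mean-std scaling and is not the library's own code; the sample values are made up.

```python
import numpy as np

def number_zeros(x, freq):
    # Same custom feature as in the thread, cast to a plain int
    return {"number_zeros": int((x == 0).sum())}

def standardize(x):
    # Hypothetical stand-in for mean-std scaling (assumption, not tsfeatures code)
    return (x - x.mean()) / x.std(ddof=1)

y = np.array([0.0, 3.0, 0.0, 5.0, 2.0, 0.0])

raw = number_zeros(y, freq=12)                  # counts the three zeros
scaled = number_zeros(standardize(y), freq=12)  # zeros are shifted away from 0

print(raw, scaled)
```

If this is what happens here, passing `scale=False` to `tsfeatures` should let the custom feature see the original values.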

Actually, the issue (an infinite loop) comes from multiprocessing, so I think tsfeatures cannot be used on Windows with Jupyter Notebook / IPython.

GGA-PERSO avatar Jan 11 '25 18:01 GGA-PERSO

I have used tsfeatures in Jupyter notebooks and did not have any issues.

ngupta23 avatar Jan 12 '25 13:01 ngupta23

OK @ngupta23, but what is your OS?

GGA-PERSO avatar Jan 12 '25 16:01 GGA-PERSO

I used it in WSL

ngupta23 avatar Jan 12 '25 17:01 ngupta23

Windows Subsystem for Linux is not pure Windows. ;) Multiprocessing works differently on Linux and Windows.

GGA-PERSO avatar Jan 12 '25 17:01 GGA-PERSO
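To check which case applies on a given machine, the standard library can report the process start method in use. This is only a diagnostic sketch (the `start_method_report` name is made up for illustration), and it does not confirm the cause of the hang reported here. On Windows only "spawn" is available, while Linux (including WSL) also offers "fork". With "spawn", worker processes must be able to import the function they run, which the Python documentation notes is a limitation for pools started from an interactive interpreter.

```python
import multiprocessing as mp
import platform

def start_method_report():
    # Collect the OS name and the multiprocessing start methods on this machine
    return {
        "os": platform.system(),
        "default_start_method": mp.get_start_method(),
        "available": mp.get_all_start_methods(),
    }

report = start_method_report()
print(report)
```

If the default is "spawn", one thing to try is moving the custom feature function into an importable `.py` module and calling `tsfeatures` from a script under an `if __name__ == "__main__":` guard rather than from a notebook cell.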

(two images attached; not preserved)

GGA-PERSO avatar Jan 12 '25 17:01 GGA-PERSO