
Feature extraction: Index error in numpy

Open adalli13 opened this issue 2 years ago • 2 comments

The problem: Hello, I have a very small DataFrame for which feature extraction fails with an error:

features = extract_features(df, column_id="index")

Traceback (most recent call last):
  File "c:\Users\alina.dallmann\.conda\envs\data_processing\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\alina.dallmann\AppData\Roaming\Python\Python39\site-packages\tsfresh\utilities\distribution.py", line 43, in _function_with_partly_reduce
    results = list(itertools.chain.from_iterable(results))
  File "C:\Users\alina.dallmann\AppData\Roaming\Python\Python39\site-packages\tsfresh\utilities\distribution.py", line 42, in <genexpr>
    results = (map_function(chunk, **kwargs) for chunk in chunk_list)
  File "C:\Users\alina.dallmann\AppData\Roaming\Python\Python39\site-packages\tsfresh\feature_extraction\extraction.py", line 386, in _do_extraction_on_chunk
    return list(_f())
  File "C:\Users\alina.dallmann\AppData\Roaming\Python\Python39\site-packages\tsfresh\feature_extraction\extraction.py", line 364, in _f
    result = func(x, param=parameter_list)
  File "...\AppData\Roaming\Python\Python39\site-packages\tsfresh\feature_extraction\feature_calculators.py", line 2103, in friedrich_coefficients
    calculated[m][r] = _estimate_friedrich_coefficients(x, m, r)
  File "...\AppData\Roaming\Python\Python39\site-packages\tsfresh\feature_extraction\feature_calculators.py", line 152, in _estimate_friedrich_coefficients
    df["quantiles"] = pd.qcut(df.signal, r)
  File "...\.conda\envs\data_processing\lib\site-packages\pandas\core\reshape\tile.py", line 376, in qcut
    bins = np.quantile(x_np, quantiles)
  File "<__array_function__ internals>", line 180, in quantile
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\lib\function_base.py", line 4371, in quantile
    return _quantile_unchecked(
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\lib\function_base.py", line 4383, in _quantile_unchecked
    r, k = _ureduce(a,
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\lib\function_base.py", line 3702, in _ureduce
    r = func(a, **kwargs)
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\lib\function_base.py", line 4552, in _quantile_ureduce_func
    result = _quantile(arr,
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\lib\function_base.py", line 4658, in _quantile
    take(arr, indices=-1, axis=DATA_AXIS)
  File "<__array_function__ internals>", line 180, in take
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\core\fromnumeric.py", line 190, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "...\.conda\envs\data_processing\lib\site-packages\numpy\core\fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.
"""

The above exception was the direct cause of the following exception:

IndexError                                Traceback (most recent call last)
<ipython-input-200-81610c9844b8> in <module>
      2 from tsfresh.feature_extraction import EfficientFCParameters, MinimalFCParameters
...
---> 57         return bound(*args, **kwds)
     58     except TypeError:
     59         # A TypeError occurs if the object does have such a method in its

IndexError: cannot do a non-empty take from an empty axes.

With the minimal feature calculator parameters, extract_features worked:

features = extract_features(df, column_id="index", default_fc_parameters=MinimalFCParameters())
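The traceback above names friedrich_coefficients as the failing calculator, so a narrower workaround than MinimalFCParameters might be to drop only that calculator from the settings. A minimal sketch, assuming the settings object behaves like a dict that maps calculator names to parameters; the helper drop_calculators and the stand-in dict are hypothetical and not part of tsfresh:

```python
def drop_calculators(settings, names):
    """Return a plain-dict copy of `settings` without the calculators in `names`."""
    unwanted = set(names)
    return {name: params for name, params in settings.items() if name not in unwanted}


# Illustrative stand-in for a tsfresh settings object (assumption). With tsfresh
# installed, ComprehensiveFCParameters() from tsfresh.feature_extraction would
# take its place.
example_settings = {
    "mean": None,
    "maximum": None,
    "friedrich_coefficients": [{"coeff": c, "m": 3, "r": 30} for c in range(4)],
}

reduced = drop_calculators(example_settings, ["friedrich_coefficients"])
print(sorted(reduced))
```

The reduced mapping could then be passed to extract_features as default_fc_parameters=reduced. All other calculators stay enabled, so this does not rule out similar failures elsewhere.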

Given only this error message from numpy, I have no idea why exactly this fails. I would appreciate at least a hint in the error message :)

The data is in the attached CSV file: example_df.csv

Environment:

  • Python version: 3.9.6
  • Operating System: Windows 10
  • tsfresh version: 0.19.0
  • Install method (conda, pip, source): conda
  • numpy: 1.22.4
  • pandas: 1.4.2

adalli13, Aug 04 '22 08:08

This could be an issue with your Python version. I was experiencing the same issue with Python 3.8. I downgraded to Python 3.7 and have been able to execute the same code successfully.

mdhanna, Aug 16 '22 01:08

I am also having this issue. I'm running Python 3.9.7 on a Mac with an M1 chip. I can also confirm that it works when using MinimalFCParameters.

davfarrugia, Aug 18 '22 08:08

The same issue occurs under the following conditions:

Python 3.8.13.
tsfresh 0.19.0 pypi_0 pypi

multiprocessing.pool.RemoteTraceback:

"""
Traceback (most recent call last):
  File "/opt/conda/envs/py38/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/utilities/distribution.py", line 43, in _function_with_partly_reduce
    results = list(itertools.chain.from_iterable(results))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/utilities/distribution.py", line 42, in <genexpr>
    results = (map_function(chunk, **kwargs) for chunk in chunk_list)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/extraction.py", line 386, in _do_extraction_on_chunk
    return list(_f())
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/extraction.py", line 364, in _f
    result = func(x, param=parameter_list)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/feature_calculators.py", line 2103, in friedrich_coefficients
    calculated[m][r] = _estimate_friedrich_coefficients(x, m, r)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/feature_calculators.py", line 152, in _estimate_friedrich_coefficients
    df["quantiles"] = pd.qcut(df.signal, r)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/pandas/core/reshape/tile.py", line 377, in qcut
    bins = np.quantile(x_np, quantiles)
  File "<__array_function__ internals>", line 180, in quantile
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4412, in quantile
    return _quantile_unchecked(
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4424, in _quantile_unchecked
    r, k = _ureduce(a,
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3725, in _ureduce
    r = func(a, **kwargs)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4593, in _quantile_ureduce_func
    result = _quantile(arr,
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4704, in _quantile
    previous = np.take(arr, previous_indexes, axis=DATA_AXIS)
  File "<__array_function__ internals>", line 180, in take
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 190, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "5j_Model_lightGBM_v9.py", line 25, in <module>
    X = extract_features(df_train_feature,
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/extraction.py", line 164, in extract_features
    result = _do_extraction(
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/extraction.py", line 294, in _do_extraction
    result = distributor.map_reduce(
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/utilities/distribution.py", line 241, in map_reduce
    result = list(itertools.chain.from_iterable(result))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/opt/conda/envs/py38/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
  File "/opt/conda/envs/py38/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/utilities/distribution.py", line 43, in _function_with_partly_reduce
    results = list(itertools.chain.from_iterable(results))
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/utilities/distribution.py", line 42, in <genexpr>
    results = (map_function(chunk, **kwargs) for chunk in chunk_list)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/extraction.py", line 386, in _do_extraction_on_chunk
    return list(_f())
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/extraction.py", line 364, in _f
    result = func(x, param=parameter_list)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/feature_calculators.py", line 2103, in friedrich_coefficients
    calculated[m][r] = _estimate_friedrich_coefficients(x, m, r)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/tsfresh/feature_extraction/feature_calculators.py", line 152, in _estimate_friedrich_coefficients
    df["quantiles"] = pd.qcut(df.signal, r)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/pandas/core/reshape/tile.py", line 377, in qcut
    bins = np.quantile(x_np, quantiles)
  File "<__array_function__ internals>", line 180, in quantile
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4412, in quantile
    return _quantile_unchecked(
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4424, in _quantile_unchecked
    r, k = _ureduce(a,
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3725, in _ureduce
    r = func(a, **kwargs)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4593, in _quantile_ureduce_func
    result = _quantile(arr,
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/lib/function_base.py", line 4704, in _quantile
    previous = np.take(arr, previous_indexes, axis=DATA_AXIS)
  File "<__array_function__ internals>", line 180, in take
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 190, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.

masa8, Oct 17 '22 20:10

Hi @adalli13, @masa8. Thanks for bringing this problem to our attention. Which pandas version are you using? And would you be able to provide an example DataFrame to reproduce the problem?

kempa-liehr, Nov 17 '22 22:11

I used pandas 1.4.2, as described in my original report. Unfortunately, I no longer have access to the original code, so I cannot provide an example DataFrame, sorry.

adalli13, Nov 17 '22 22:11

I wanted to use the library again in a different setting and encountered the same issue. I was also able to reproduce the same error with a very small example:

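import random
import pandas as pd
import tsfresh

# One random value per calendar day in the range
dates = pd.date_range("2022-06-12", "2022-12-20")
values_1 = [random.randint(0, 200) for _ in range(len(dates))]
test_df = pd.DataFrame(values_1, index=dates)
# reset_index() moves the dates into a column named "index", used here as the id column
tsfresh.extract_features(test_df.reset_index(), column_id="index")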

My current environment is:

  • Python 3.9.13 on Mac M1 (conda environment, installed using pip)
  • tsfresh=0.19.0
  • pandas=1.5.2
  • numpy=1.23.5
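A possible explanation, inferred from the example rather than confirmed in this thread: with column_id="index" every date becomes its own id, so each time series holds a single row, which would be consistent with the "empty axes" message raised under np.quantile. A minimal sketch for spotting such ids before extraction; the helper short_series_ids and the min_length=2 threshold are assumptions for illustration:

```python
import pandas as pd


def short_series_ids(df, column_id, min_length=2):
    """Return the ids whose time series has fewer than `min_length` rows."""
    sizes = df.groupby(column_id).size()
    return sizes[sizes < min_length].index.tolist()


# Same shape as the example above, with fixed values: one row per date.
dates = pd.date_range("2022-06-12", "2022-06-16")
frame = pd.DataFrame({"value": [3, 1, 4, 1, 5]}, index=dates).reset_index()

# Using the date column as the id gives five series of length one.
too_short = short_series_ids(frame, column_id="index")
print(len(too_short), "ids have fewer than 2 rows")

# A constant id column (hypothetical name) treats all rows as a single series.
single_series = frame.assign(series_id=0)
print(short_series_ids(single_series, column_id="series_id"))
```

If the dates are meant to be the time axis of one series, a constant id column combined with column_sort="index" may be closer to the intended call.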

adalli13, Dec 21 '22 10:12

Thank you @adalli13 for the bug report and sorry for the delay! I can reproduce the issue. I am working on a solution.

nils-braun, Feb 28 '23 21:02

Sorry for the wrong information, @adalli13. The issue has been fixed in version 0.20.0: https://tsfresh.readthedocs.io/en/latest/changes.html#version-0-20-0
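To tell whether an environment already has the release mentioned above, the installed version can be compared against 0.20.0. A minimal sketch using only the standard library; release_tuple and has_fix are hypothetical helper names, and the parser only handles plain dotted numeric versions:

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (0, 20, 0)  # release named in the comment above


def release_tuple(version_string):
    """Parse leading numeric components, e.g. '0.19.0' -> (0, 19, 0)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at suffixes such as 'post0' or 'dev1'
        parts.append(int(digits))
    # Pad short versions such as '0.20' so that they compare as (0, 20, 0).
    while len(parts) < len(FIXED_IN):
        parts.append(0)
    return tuple(parts)


def has_fix(version_string):
    """True if the given version is at least the release that contains the fix."""
    return release_tuple(version_string) >= FIXED_IN


try:
    installed = version("tsfresh")
    status = "includes" if has_fix(installed) else "predates"
    print(f"tsfresh {installed} {status} the 0.20.0 fix")
except PackageNotFoundError:
    print("tsfresh is not installed in this environment")
```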

nils-braun, Mar 02 '23 21:03