sktime
[BUG] TSFreshFeatureExtractor RuntimeError outside of main
Describe the bug
When using the TSFreshFeatureExtractor transformer, the following error is repeatedly output to the terminal. After ruling out the tests and functions used to generate test results as the cause, I found that it does not occur when the code is run inside an `if __name__ == "__main__":` block.
It is entirely possible this is just me being Python illiterate, and it's common sense that the code should be run this way with a main, but I will create an issue just in case :).
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 268, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "D:\CMP Machine Learning\sktime-workshop-boss\sktime\contrib\local_code.py", line 12, in <module>
tsfresh.fit_transform(X_train, y_train)
File "D:\CMP Machine Learning\sktime-workshop-boss\sktime\transformations\base.py", line 91, in fit_transform
return self.fit(Z, X).transform(Z)
File "D:\CMP Machine Learning\sktime-workshop-boss\sktime\transformations\panel\tsfresh.py", line 176, in transform
Xt = extract_features(
File "D:\CMP Machine Learning\sktime-workshop-boss\venv\lib\site-packages\tsfresh\feature_extraction\extraction.py", line 152, in extract_features
result = _do_extraction(df=timeseries_container,
File "D:\CMP Machine Learning\sktime-workshop-boss\venv\lib\site-packages\tsfresh\feature_extraction\extraction.py", line 240, in _do_extraction
distributor = MultiprocessingDistributor(n_workers=n_jobs,
File "D:\CMP Machine Learning\sktime-workshop-boss\venv\lib\site-packages\tsfresh\utilities\distribution.py", line 420, in __init__
self.pool = Pool(processes=n_workers, initializer=initialize_warnings_in_workers, initargs=(show_warnings,))
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 212, in __init__
self._repopulate_pool()
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 326, in _repopulate_pool_static
w.start()
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\Users\Matthew Middlehurst\AppData\Local\Programs\Python\Python39\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The transformed data is eventually output, however.
dim_0__variance_larger_than_standard_deviation ... dim_0__matrix_profile__feature_"75"__threshold_0.98
0 0.0 ... 3.334879
1 0.0 ... 3.334879
2 0.0 ... 3.334879
3 0.0 ... 3.334879
4 0.0 ... 3.334879
.. ... ... ...
62 0.0 ... 3.334879
63 0.0 ... 3.334879
64 0.0 ... 3.334879
65 0.0 ... 3.334879
66 0.0 ... 3.334879
[67 rows x 787 columns]
To Reproduce
from sktime.datasets import load_italy_power_demand
from sktime.transformations.panel.tsfresh import TSFreshFeatureExtractor
X_train, y_train = load_italy_power_demand(split="train", return_X_y=True)
tsfresh = TSFreshFeatureExtractor(
    default_fc_parameters="comprehensive",
    show_warnings=False,
    disable_progressbar=True,
)
t = tsfresh.fit_transform(X_train, y_train)
print(t)
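For comparison, here is a minimal sketch of the guarded layout that avoids this error. It uses only the standard library: a plain `multiprocessing.Pool` stands in for the pool that tsfresh creates in the traceback above, and `square` and `run_extraction` are hypothetical names chosen for illustration, not sktime or tsfresh functions.

```python
from multiprocessing import Pool


def square(value):
    """Stand-in for per-series work; defined at module level so workers can import it."""
    return value * value


def run_extraction(values, n_workers=2):
    """Start the worker pool only when called, never at import time."""
    with Pool(processes=n_workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Under the spawn start method each worker re-imports this file under a
    # different module name, so this block is skipped inside the workers.
    print(run_extraction([1, 2, 3]))
```

Applied to the script above, this would mean moving the `fit_transform` and `print` calls into a function that is only called under the guard.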
Versions
This occurs on both my machines running Windows 10, one using Python 3.8 and the other 3.9.
System:
python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
executable: E:\_ProgramFiles\Anaconda3\envs\sktime-workshop-boss\python.exe
machine: Windows-10-10.0.19041-SP0
Python dependencies:
pip: 20.2.4
setuptools: 50.3.0.post20201006
sklearn: 0.24.2
sktime: 0.7.0
statsmodels: 0.12.1
numpy: 1.19.4
scipy: 1.5.3
Cython: 0.29.21
pandas: 1.1.4
matplotlib: 3.4.2
joblib: 0.17.0
numba: 0.51.2
pmdarima: None
tsfresh: 0.18.0
I've run this on macOS and Linux without issue, so it may be a Windows thing. Can anyone try it out?
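One possible explanation for the platform difference, offered as a guess rather than a confirmed diagnosis: the default `multiprocessing` start method varies by platform and Python version, and Windows only supports `spawn`, which re-imports the main module in every worker. The sketch below reports the start method on the current machine; `needs_main_guard` is a hypothetical helper name.

```python
import multiprocessing as mp


def needs_main_guard(start_method):
    """spawn and forkserver re-import the main module, so it must be import-safe."""
    return start_method != "fork"


if __name__ == "__main__":
    # Note: get_start_method() fixes the start method to the platform default
    # if nothing has set it yet.
    method = mp.get_start_method()
    print("supported start methods:", mp.get_all_start_methods())
    print("start method in use:", method)
    print("main guard required:", needs_main_guard(method))
```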
Flashbacks to catch22... Hope the features aren't different between them.
I ran it on Windows and recreated the bug. I tried it both with and without the following settings and got the same bug each time:
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"
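For anyone repeating this check, a small sketch of how those settings might be applied. As far as I know, these variables limit the native thread pools (MKL, NumExpr, OpenMP) rather than the number of worker processes, and they typically need to be set before the numerical libraries are first imported. `limit_native_threads` is a hypothetical helper name.

```python
import os

THREAD_ENV_VARS = ("MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS", "OMP_NUM_THREADS")


def limit_native_threads(n_threads=1):
    """Set the thread-count variables for this process and its children."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)


# Call this before importing numpy, sktime, or tsfresh.
limit_native_threads(1)
```

Since these settings control threads and not process start-up, it would make sense that they had no effect on the error.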
On: Windows 10, Python 3.7, 3.8, 3.9, current 0.10.
- [x] Developer install: unable to reproduce
- [ ] Will do another check with "non developer setup".
After some testing, this appears to be fixed in tsfresh 0.19.0. I don't think this issue is serious enough to change the version bounds for the dependency, however.
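To check whether a given environment already has that tsfresh version, something like the following sketch could be used. It relies only on the standard library; `release_tuple`, `is_at_least`, and `tsfresh_at_least` are hypothetical helper names, and the comparison ignores pre-release and post-release suffixes.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Return the leading numeric components, e.g. '0.19.0' -> (0, 19, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Compare two version strings on their numeric release components only."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def tsfresh_at_least(minimum="0.19.0"):
    """Return True/False for the installed tsfresh, or None if it is not installed."""
    try:
        installed = metadata.version("tsfresh")
    except metadata.PackageNotFoundError:
        return None
    return is_at_least(installed, minimum)


if __name__ == "__main__":
    print("tsfresh >= 0.19.0:", tsfresh_at_least())
```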