qlib
Cannot Run qrun with lightgbm workflow config
🐛 Bug Description
(qlib) 👍 examples main qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
[33800:MainThread](2022-01-03 15:22:20,670) INFO - qlib.Initialization - [config.py:391] - default_conf: client.
[33800:MainThread](2022-01-03 15:22:20,671) WARNING - qlib.Initialization - [config.py:416] - redis connection failed(host=127.0.0.1 port=6379), DiskExpressionCache and DiskDatasetCache will not be used!
[33800:MainThread](2022-01-03 15:22:20,672) INFO - qlib.Initialization - [__init__.py:68] - qlib successfully initialized based on client settings.
[33800:MainThread](2022-01-03 15:22:20,672) INFO - qlib.Initialization - [__init__.py:70] - data_path={'__DEFAULT_FREQ': PosixPath('/Users/waterking/.qlib/qlib_data/cn_data')}
[33800:MainThread](2022-01-03 15:22:20,673) INFO - qlib.workflow - [expm.py:282] - No tracking URI is provided. Use the default tracking URI.
[33800:MainThread](2022-01-03 15:22:20,673) INFO - qlib.workflow - [expm.py:318] - <mlflow.tracking.client.MlflowClient object at 0x7fa97d0b8ac0>
[33800:MainThread](2022-01-03 15:22:20,693) INFO - qlib.workflow - [exp.py:249] - Experiment 1 starts running ...
[33800:MainThread](2022-01-03 15:22:20,796) INFO - qlib.workflow - [recorder.py:290] - Recorder a3ea360a53a84f0dbb7dae7b7e683dcc starts running under Experiment 1 ...
/Users/waterking/opt/anaconda3/envs/qlib/lib/python3.8/site-packages/pyqlib-0.8.0.99-py3.8-macosx-10.9-x86_64.egg/qlib/utils/__init__.py:808: FutureWarning: MultiIndex.is_lexsorted is deprecated as a public function, users should use MultiIndex.is_monotonic_increasing instead.
if idx.is_monotonic_increasing and not (isinstance(idx, pd.MultiIndex) and not idx.is_lexsorted()):
[33800:MainThread](2022-01-03 15:23:21,552) INFO - qlib.timer - [log.py:113] - Time cost: 60.085s | Loading data Done
[33800:MainThread](2022-01-03 15:23:22,967) INFO - qlib.timer - [log.py:113] - Time cost: 0.411s | DropnaLabel Done
/Users/waterking/opt/anaconda3/envs/qlib/lib/python3.8/site-packages/pandas-1.3.5-py3.8-macosx-10.9-x86_64.egg/pandas/core/frame.py:3641: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
[33800:MainThread](2022-01-03 15:23:27,797) INFO - qlib.timer - [log.py:113] - Time cost: 4.829s | CSZScoreNorm Done
[33800:MainThread](2022-01-03 15:23:27,835) INFO - qlib.timer - [log.py:113] - Time cost: 6.282s | fit & process data Done
[33800:MainThread](2022-01-03 15:23:27,836) INFO - qlib.timer - [log.py:113] - Time cost: 66.369s | Init data Done
[1] 33800 segmentation fault qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
/Users/waterking/opt/anaconda3/envs/qlib/lib/python3.8/site-packages/joblib-1.1.0-py3.8.egg/joblib/externals/loky/backend/resource_tracker.py:318: UserWarning: resource_tracker: There appear to be 2 leaked folder objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
/Users/waterking/opt/anaconda3/envs/qlib/lib/python3.8/site-packages/joblib-1.1.0-py3.8.egg/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/gr/17hy_zfj49l4p998g6xq2d340000gn/T/joblib_memmapping_folder_33800_95596af9eff64ff4b7116daedbbcbc53_a5376a7190bc40818f999d3d14e3d716: FileNotFoundError(2, 'No such file or directory')
warnings.warn('resource_tracker: %s: %r' % (name, e))
/Users/waterking/opt/anaconda3/envs/qlib/lib/python3.8/site-packages/joblib-1.1.0-py3.8.egg/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /var/folders/gr/17hy_zfj49l4p998g6xq2d340000gn/T/joblib_memmapping_folder_33800_ca9c83dbe40b49aeacc099c618d44dfa_9a4c5e1eb21743a884e50325b7cd17fa: FileNotFoundError(2, 'No such file or directory')
warnings.warn('resource_tracker: %s: %r' % (name, e))
To Reproduce
Steps to reproduce the behavior:
1. cd examples # Avoid running the program under the directory that contains qlib
2. qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
Environment
Note: User could run cd scripts && python collect_info.py all under project directory to get system information and paste them here directly.
- Qlib version: Newest
- Python version: 3.8.5
- OS: MacOS
- Commit number (optional, please provide it if you are using the dev version):
Additional Notes
Please help, thanks a lot!~
I tried to reproduce your problem but did not succeed. Please try again.
@Waterkin Qlib uses the same approach for the automatic test, and the test succeeds. https://github.com/microsoft/qlib/blob/main/.github/workflows/test.yml#L65 Could you please provide more details to reproduce this error?
For example, provide more details by running cd scripts && python collect_info.py all
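If it helps with gathering those details, here is a minimal sketch (not part of qlib) of a wrapper that runs a command with Python's `PYTHONFAULTHANDLER=1` environment variable set, so a crashing Python child process dumps its traceback to stderr, and then reports whether the process exited normally or was terminated by a signal such as SIGSEGV (segmentation fault) or SIGKILL (which the shell reports as `Killed`). The helper names `describe_exit` and `run_with_faulthandler` and the file name `diagnose.py` are made up for illustration, and the return-code decoding assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short description (hypothetical helper)."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was terminated by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = str(-returncode)
        return f"killed by signal {name}"
    return f"exited with status {returncode}"


def run_with_faulthandler(cmd):
    """Run cmd with PYTHONFAULTHANDLER=1 and describe how it ended (hypothetical helper)."""
    # PYTHONFAULTHANDLER makes a Python child print its traceback on fatal signals
    # such as SIGSEGV. A SIGKILL cannot be caught, so no traceback appears in that case.
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    proc = subprocess.run(cmd, env=env)
    return describe_exit(proc.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python diagnose.py qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
    print(run_with_faulthandler(sys.argv[1:]))
```

Pasting the traceback (if any) and the reported signal together with the collect_info.py output should make the report easier to act on.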
I hit the same problem when trying to qrun the LightGBM config. Details are as follows:
(qlib) root@fe7b424c885e:~/qlib/examples# qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
[7589:MainThread](2022-03-12 10:58:41,418) INFO - qlib.Initialization - [config.py:402] - default_conf: client.
[7589:MainThread](2022-03-12 10:58:41,424) INFO - qlib.Initialization - [__init__.py:73] - qlib successfully initialized based on client settings.
[7589:MainThread](2022-03-12 10:58:41,424) INFO - qlib.Initialization - [__init__.py:75] - data_path={'__DEFAULT_FREQ': PosixPath('/root/.qlib/qlib_data/cn_data')}
[7589:MainThread](2022-03-12 10:58:41,425) INFO - qlib.workflow - [expm.py:318] - <mlflow.tracking.client.MlflowClient object at 0x7f243f1692e0>
[7589:MainThread](2022-03-12 10:58:41,459) INFO - qlib.workflow - [exp.py:257] - Experiment 1 starts running ...
[7589:MainThread](2022-03-12 10:58:41,578) INFO - qlib.workflow - [recorder.py:293] - Recorder 9054b86afffe4ae29f2e992eb531c061 starts running under Experiment 1 ...
/root/anaconda3/envs/qlib/lib/python3.8/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/root/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/utils/__init__.py:792: FutureWarning: MultiIndex.is_lexsorted is deprecated as a public function, users should use MultiIndex.is_monotonic_increasing instead.
if idx.is_monotonic_increasing and not (isinstance(idx, pd.MultiIndex) and not idx.is_lexsorted()):
[7589:MainThread](2022-03-12 11:00:51,940) INFO - qlib.timer - [log.py:113] - Time cost: 128.654s | Loading data Done
[7589:MainThread](2022-03-12 11:01:00,391) INFO - qlib.timer - [log.py:113] - Time cost: 0.572s | DropnaLabel Done
/root/anaconda3/envs/qlib/lib/python3.8/site-packages/qlib/data/dataset/processor.py:310: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[cols] = df[cols].groupby("datetime").apply(self.zscore_func)
[7589:MainThread](2022-03-12 11:01:13,439) INFO - qlib.timer - [log.py:113] - Time cost: 13.047s | CSZScoreNorm Done
[7589:MainThread](2022-03-12 11:01:13,440) INFO - qlib.timer - [log.py:113] - Time cost: 21.497s | fit & process data Done
[7589:MainThread](2022-03-12 11:01:13,441) INFO - qlib.timer - [log.py:113] - Time cost: 150.155s | Init data Done
Killed
(qlib) root@fe7b424c885e:~/qlib/examples# /root/anaconda3/envs/qlib/lib/python3.8/site-packages/joblib/externals/loky/backend/resource_tracker.py:318: UserWarning: resource_tracker: There appear to be 2 leaked folder objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
/root/anaconda3/envs/qlib/lib/python3.8/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /tmp/joblib_memmapping_folder_7589_f6d3f263998248d0b69f07daaa34f180_cc0445d7b4ae408cb69956a0c0168cf3: FileNotFoundError(2, 'No such file or directory')
warnings.warn('resource_tracker: %s: %r' % (name, e))
/root/anaconda3/envs/qlib/lib/python3.8/site-packages/joblib/externals/loky/backend/resource_tracker.py:333: UserWarning: resource_tracker: /tmp/joblib_memmapping_folder_7589_4994d8a48e124178a8d918fdd1a8d497_5b46fd15f6814361ade5610b909fbbb2: FileNotFoundError(2, 'No such file or directory')
warnings.warn('resource_tracker: %s: %r' % (name, e))
- And here is the result of collect_info.py:
Linux
x86_64
Linux-5.10.60.1-microsoft-standard-WSL2-x86_64-with-glibc2.10
#1 SMP Wed Aug 25 23:20:18 UTC 2021
Python version: 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:53:36) [GCC 9.4.0]
Qlib version: 0.8.4.99
numpy==1.22.3
pandas==1.4.1
scipy==1.8.0
requests==2.27.1
sacred==0.8.2
python-socketio==5.5.2
redis==4.1.4
python-redis-lock==3.7.0
schedule==1.1.0
cvxpy==1.2.0
hyperopt==0.1.2
fire==0.4.0
statsmodels==0.13.2
xlrd==2.0.1
plotly==5.6.0
matplotlib==3.5.1
tables==3.7.0
pyyaml==6.0
mlflow==1.24.0
tqdm==4.63.0
loguru==0.6.0
lightgbm==3.3.2
tornado==6.1
joblib==1.1.0
fire==0.4.0
ruamel.yaml==0.17.21
- Since I am running this tutorial in a container, that may have something to do with it. Here is the base image I built and use, without qlib and the conda env above:
docker pull rolovoid/pytorch-cuda-conda:latest
Thanks a lot!
I also have a similar error here: https://github.com/microsoft/qlib/issues/1099
And I cannot pass pytest either.