qlib
exists_skip and delete_old ignored
🐛 Bug Description
When running the following command (note the last two arguments):
python data_collector/yahoo/collector.py update_data_to_bin --qlib_data_1d_dir ~/.qlib/qlib_data/us_data --region us --interval 1d --version v2 --trading_date 2022-01-01 --end_date 2022-12-31 --exists_skip true --delete_old false
I get the following terminal output and prompt:
python scripts/data_collector/yahoo/collector.py update_data_to_bin --qlib_data_1d_dir ~/.qlib/qlib_data/us_data --region us --interval 1d --version v2 --trading_date 2022-01-01 --end_date 2022-12-31 --exists_skip true --delete_old false
2022-06-09 23:17:19.757 | WARNING | qlib.tests.data:_download_data:56 - The data for the example is collected from Yahoo Finance. Please be aware that the quality of the data might not be perfect. (You can refer to the original data source: https://finance.yahoo.com/lookup.)
2022-06-09 23:17:19.757 | INFO | qlib.tests.data:_download_data:59 - qlib_data_us_1d_latest.zip downloading......
450095104it [02:32, 2956744.87it/s]
2022-06-09 23:19:51.988 | WARNING | qlib.tests.data:_unzip:81 - will delete the old qlib data directory(features, instruments, calendars, features_cache, dataset_cache): /root/.qlib/qlib_data/us_data
Will be deleted:
['/root/.qlib/qlib_data/us_data/features', '/root/.qlib/qlib_data/us_data/calendars', '/root/.qlib/qlib_data/us_data/instruments']
If you do not need to delete /root/.qlib/qlib_data/us_data, please change the <--target_dir>
Are you sure you want to delete, yes(Y/y), no (N/n):
To Reproduce
Steps to reproduce the behavior:
- Run the above stated command
Expected Behavior
The --exists_skip and --delete_old arguments should be honored: with --delete_old false, the existing data under the target directory should not be deleted and no deletion prompt should appear.
Environment
Note: Users can run cd scripts && python collect_info.py all under the project directory to get system information and paste it here directly.
python scripts/collect_info.py all
Linux
x86_64
Linux-5.15.0-35-generic-x86_64-with-glibc2.2.5
#36-Ubuntu SMP Sat May 21 02:24:07 UTC 2022
Python version: 3.8.0 (default, Nov 23 2019, 05:36:56) [GCC 8.3.0]
Qlib version: 0.8.5
numpy==1.22.4
pandas==1.4.2
scipy==1.8.1
requests==2.28.0
sacred==0.8.2
python-socketio==5.6.0
redis==4.3.3
python-redis-lock==3.7.0
schedule==1.1.0
cvxpy==1.2.1
hyperopt==0.1.2
fire==0.4.0
statsmodels==0.13.2
xlrd==2.0.1
plotly==5.8.1
matplotlib==3.5.2
tables==3.7.0
pyyaml==6.0
mlflow==1.26.1
tqdm==4.64.0
loguru==0.6.0
lightgbm==3.3.2
tornado==6.1
joblib==1.1.0
fire==0.4.0
ruamel.yaml==0.17.21
I think I have found the cause of this problem: if I add the delete_old parameter when calling the qlib_data method in this line of code, the problem is solved.
Would you like to contribute to the community by opening a pull request to fix this issue?
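The pattern behind the proposed fix can be sketched in isolation. This is a minimal illustration, not Qlib's actual code: FakeDownloader, update_without_forwarding, and update_with_forwarding are hypothetical names, and the default values on the stand-in qlib_data method are an assumption chosen to match the deletion prompt shown in the log above. It only shows how a flag that is accepted by a wrapper but not forwarded gets silently replaced by the callee's default.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDownloader:
    """Hypothetical stand-in for the object that exposes qlib_data."""

    calls: list = field(default_factory=list)

    def qlib_data(self, target_dir, delete_old=True, exists_skip=False):
        # Assumed defaults: delete old data, do not skip existing data.
        call = {
            "target_dir": target_dir,
            "delete_old": delete_old,
            "exists_skip": exists_skip,
        }
        self.calls.append(call)
        return call


def update_without_forwarding(downloader, target_dir, delete_old=True, exists_skip=False):
    # The flags are accepted here but never passed on,
    # so the callee falls back to its own defaults.
    return downloader.qlib_data(target_dir)


def update_with_forwarding(downloader, target_dir, delete_old=True, exists_skip=False):
    # Forwarding the flags explicitly makes the caller's choice take effect.
    return downloader.qlib_data(
        target_dir, delete_old=delete_old, exists_skip=exists_skip
    )


if __name__ == "__main__":
    d = FakeDownloader()
    print(update_without_forwarding(d, "us_data", delete_old=False, exists_skip=True))
    print(update_with_forwarding(d, "us_data", delete_old=False, exists_skip=True))
```

In the first call the user's delete_old=False is dropped and the stand-in still sees delete_old=True, which mirrors the symptom reported here; in the second call both flags reach the callee unchanged.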