ValueError: could not convert string to float: 'SH600000' when I use dump_bin.py
🐛 Bug Description
To Reproduce
Steps to reproduce the behavior:
- python scripts/data_collector/baostock_5min/collector.py download_data --source_dir ~/.qlib/stock_data/source/hs300_5min_original --start 2022-01-01 --end 2022-01-30 --interval 5min --region HS300
- python scripts/dump_bin.py dump_all --csv_path ~/.qlib/stock_data/source/hs300_5min_original --qlib_dir ~/.qlib/qlib_data/samples
Error:
"""
Traceback (most recent call last):
File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
return [fn(*args) for args in chunk]
File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 198, in
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 508, in
Screenshot
Environment
Darwin arm64 macOS-14.2-arm64-arm-64bit Darwin Kernel Version 23.2.0: Wed Nov 15 21:54:55 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8122
Python version: 3.8.19 (default, Mar 20 2024, 15:27:52) [Clang 14.0.6 ]
Qlib version: 0.9.5
numpy==1.23.5 pandas==2.0.3 scipy==1.10.1 requests==2.32.3 sacred==0.8.6 python-socketio==5.11.4 redis==5.0.8 python-redis-lock==4.0.0 schedule==1.2.2 cvxpy==1.5.2 hyperopt==0.1.2 fire==0.6.0 statsmodels==0.14.1 xlrd==2.0.1 plotly==5.24.1 matplotlib==3.7.5 tables==3.7.0 pyyaml==6.0.2 mlflow==1.14.1 tqdm==4.66.5 loguru==0.7.2 lightgbm==4.5.0 tornado==6.4.1 joblib==1.4.2 fire==0.6.0 ruamel.yaml==0.17.36
Additional Notes
same issue
same issue
Based on the information you provided, the problem is in your steps: after download_data you need to run the normalization step (normalize_data) first, and only then use dump_bin to convert the data to bin format. The documentation is here.
Before you normalize, you also need to prepare daily-frequency data whose date range covers the date range of the data being normalized.
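For anyone else hitting this, a quick pre-check before running dump_bin might look like the sketch below. It is only a minimal illustration using the standard library and is not part of qlib: the helper names `non_numeric_columns` and `daily_covers`, the sample rows, and the column names `date` and `code` are all assumptions made for the example. The first helper lists CSV columns whose values `float()` rejects (the kind of conversion failure in the issue title), and the second checks that a set of daily dates spans the intraday timestamps.

```python
import csv
import io
from datetime import datetime


def non_numeric_columns(rows, skip=("date",)):
    """Return names of columns (outside `skip`) with a value float() cannot parse."""
    bad = []
    for row in rows:
        for name, value in row.items():
            # Ignore skipped columns, columns already flagged, and empty cells.
            if name in skip or name in bad or value == "":
                continue
            try:
                float(value)
            except ValueError:
                bad.append(name)
    return bad


def daily_covers(daily_dates, intraday_timestamps):
    """True if the daily date range spans the date of every intraday timestamp."""
    daily = [datetime.fromisoformat(d).date() for d in daily_dates]
    intra = [datetime.fromisoformat(t).date() for t in intraday_timestamps]
    return min(daily) <= min(intra) and max(intra) <= max(daily)


# Hypothetical raw 5-minute rows; real collector output may use other columns.
sample_csv = (
    "date,open,close,volume,code\n"
    "2022-01-04 09:35:00,8.50,8.52,1000,SH600000\n"
    "2022-01-04 09:40:00,8.52,8.51,1200,SH600000\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))

print(non_numeric_columns(rows))  # ['code']
print(daily_covers(["2021-12-31", "2022-01-31"], [r["date"] for r in rows]))  # True
```

If the first call reports any column, that column holds strings that a float conversion will reject. If the second call returns False, the daily data does not cover the intraday range.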