Error loading US data with Yahoo collector.
🐛 Bug Description
Loading US data with the Yahoo collector fails. The failure occurs in the utils._get_eastmoney() method.
To Reproduce
python collector.py download_data --source_dir ~/.qlib/stock_data/source/us_data --start 2020-01-01 --end 2020-12-31 --delay 1 --interval 1d --region US
Root cause
The collector fetches the stock catalog for the US market from the following URL: http://4.push2delay.eastmoney.com/api/qt/clist/get?pn=1&pz=10000&fs=m:105,m:106,m:107&fields=f12
Although the page size is set to 10000, the endpoint now returns at most 100 items. As a result, the following check raises a "request error" ValueError:
https://github.com/microsoft/qlib/blob/ba8b6cc30f985065b7d5393888bb3f7a8937e861/scripts/data_collector/utils.py#L315C1-L316C46
    if len(_symbols) < 8000:
        raise ValueError("request error")
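One possible workaround, sketched below, is to request the catalog page by page instead of relying on a single large `pz`. This is only a sketch and not necessarily what the upstream fix does. It assumes the endpoint honors the `pn` page number and returns JSON shaped like `{"data": {"diff": [{"f12": "AAPL"}, ...]}}`; neither assumption is confirmed here. The names `collect_symbols` and `fetch_page_eastmoney` are hypothetical and not part of qlib.

```python
import json
import urllib.request
from typing import Callable, List


def collect_symbols(
    fetch_page: Callable[[int, int], dict],
    page_size: int = 100,
    max_pages: int = 1000,
) -> List[str]:
    """Collect symbols page by page until a short or empty page is returned."""
    symbols: List[str] = []
    for page in range(1, max_pages + 1):
        payload = fetch_page(page, page_size)
        # Assumed response shape: {"data": {"diff": [{"f12": "<symbol>"}, ...]}}
        data = payload.get("data") or {}
        items = data.get("diff") or []
        symbols.extend(item["f12"] for item in items)
        # A page shorter than page_size is treated as the last page.
        if len(items) < page_size:
            break
    return symbols


def fetch_page_eastmoney(page: int, page_size: int) -> dict:
    """Hypothetical fetcher: same URL as above, with pn and pz parameterized."""
    url = (
        "http://4.push2delay.eastmoney.com/api/qt/clist/get"
        f"?pn={page}&pz={page_size}&fs=m:105,m:106,m:107&fields=f12"
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Usage would be `symbols = collect_symbols(fetch_page_eastmoney)`, after which the existing `len(_symbols) < 8000` check could be applied to the combined list.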
Environment
Note: User could run cd scripts && python collect_info.py all under project directory to get system information
and paste them here directly.
- Qlib version: latest master
- Python version: 3.12
- OS (Windows, Linux, MacOS): Linux
- Commit number (optional, please provide it if you are using the dev version):
Additional Notes
Hi @nodew, this is an issue we are actively working on. In the meantime, you can try pulling PR 1975 as a temporary fix.