Some data in csi300.txt for the CSI 300 constituents is conflicting
❓ Questions and Help
I used collector.py in cn_index to fetch the CSI 300 index constituents from 2008 to 2025 and found that the date ranges for some stocks are wrong. How can I get correct data?
I generated csi300.txt with the command python collector.py --index_name CSI300 --qlib_dir ~/.qlib/qlib_data/cn_data --method parse_instruments, and I could not find the problem you mentioned in the output.
Thank you for your response. I ran the command above again. Could you please take a look at the stock SH600023? It has two entries: one from 6/16/2014 to 6/12/2020 and one from 1/1/2005 to 5/14/2025. These two periods clearly overlap. The issue is easier to spot if the file is sorted by stock code.
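For anyone who wants to check their own file, here is a minimal sketch of how overlapping periods like the SH600023 ones could be detected. It assumes each line of csi300.txt holds a symbol, a start date, and an end date separated by whitespace, with dates in YYYY-MM-DD form. `find_overlaps` is a hypothetical helper, not part of Qlib, and the SH600000 rows are made-up sample data included only as a non-overlapping contrast.

```python
from collections import defaultdict
from datetime import date


def find_overlaps(lines):
    """Return {symbol: [(earlier_period, later_period), ...]} for overlapping periods.

    Assumed line layout: SYMBOL <whitespace> START <whitespace> END (ISO dates).
    """
    periods = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        symbol, start, end = line.split()[:3]
        periods[symbol].append((date.fromisoformat(start), date.fromisoformat(end)))

    overlaps = {}
    for symbol, spans in periods.items():
        spans.sort()  # order by start date, then end date
        clashes = []
        latest = spans[0]  # the period seen so far with the latest end date
        for cur in spans[1:]:
            # End dates are treated as inclusive, so a period starting on or
            # before the latest end date counts as an overlap.
            if cur[0] <= latest[1]:
                clashes.append((latest, cur))
            if cur[1] > latest[1]:
                latest = cur
        if clashes:
            overlaps[symbol] = clashes
    return overlaps


sample = [
    "SH600023\t2014-06-16\t2020-06-12",  # periods reported in this thread
    "SH600023\t2005-01-01\t2025-05-14",
    "SH600000\t2005-01-01\t2012-06-29",  # made-up, non-overlapping periods
    "SH600000\t2012-07-02\t2025-05-14",
]
result = find_overlaps(sample)
for symbol, clashes in result.items():
    for a, b in clashes:
        print(symbol, a[0], a[1], "overlaps", b[0], b[1])
```

To run it against a real file, pass the open file object instead of `sample`, for example `find_overlaps(open(path))`, where `path` points at the generated csi300.txt.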
I can see the problem now. Would you like to fix it? Code contributions are very welcome.
I have identified the cause of the issue. When retrieving announcements on changes to constituent stocks, we searched using the title “Announcement on Adjustments to the Constituents of the CSI 300 and CSI Hong Kong 100 Indices”. However, after 2022, the official CSI website changed the naming convention of these announcements, resulting in missing data for the subsequent period. Consequently, the dataset we obtained was incomplete, leading to the observed issues. I apologize that I have only identified the cause of the problem and have not yet resolved it.