
[question] Why not drop duplicates, as the save_new_companies function does?

Open keontang opened this issue 2 months ago • 1 comments

https://github.com/microsoft/qlib/blob/7d66e4b7882862ab3d72087ebfe9ca88e80b458a/scripts/data_collector/index.py#L228

Should we do it like this: new_df = new_df.drop_duplicates([self.SYMBOL_FIELD_NAME])?

keontang, Oct 10 '25 12:10

Hi @keontang, thanks for your attention to qlib. parse_instruments() does not drop duplicates, and this is intentional. In its context, repeated symbols represent different validity intervals. These are required for the historical index constituents to be correct, so they are not redundant data.

SunsetWolf, Oct 13 '25 07:10
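
To illustrate the point made in the reply above, here is a minimal pandas sketch of what could go wrong if duplicates were dropped by symbol. It is not taken from the qlib code base: the column names (symbol, start_date, end_date), the tickers, the dates, and the members_on helper are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical instruments table: one row per (symbol, validity interval).
# Column names and values are illustrative assumptions, not qlib's actual schema.
instruments = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "AAA"],
        "start_date": pd.to_datetime(["2010-01-04", "2010-01-04", "2018-06-01"]),
        "end_date": pd.to_datetime(["2014-12-31", "2020-12-31", "2020-12-31"]),
    }
)


def members_on(df, date):
    """Return the sorted symbols whose validity interval contains `date`."""
    ts = pd.Timestamp(date)
    mask = (df["start_date"] <= ts) & (ts <= df["end_date"])
    return sorted(df.loc[mask, "symbol"])


# Dropping duplicates by symbol keeps only the first interval of AAA,
# so its second membership period (from 2018-06-01) is lost.
deduped = instruments.drop_duplicates(["symbol"])

full_members = members_on(instruments, "2019-01-02")
deduped_members = members_on(deduped, "2019-01-02")

print(full_members)     # AAA and BBB are both members on this date
print(deduped_members)  # only BBB remains after deduplication
```

In this toy example AAA leaves the index and later rejoins it. Deduplicating on the symbol alone silently removes the second interval, which is why the repeated rows carry real information.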