numpy.datetime64 precision causes dict indexing failure in index_data.py

Open GeorgeGuo1202 opened this issue 1 year ago • 2 comments

🐛 Bug Description

A numpy.datetime64 with ns precision compares equal to the same timestamp with second precision. However, when the two are used as dict keys, the lookup fails.

numpy.datetime64('2017-01-04T00:00:00.000000000')==numpy.datetime64('2017-01-04T00:00:00')
Out[24]: True
self.index_map[numpy.datetime64('2017-01-04T00:00:00.000000000')]
Out[25]: 1
self.index_map[numpy.datetime64('2017-01-04T00:00:00')]
Traceback (most recent call last):
  File "F:\work\env\py311\Lib\site-packages\IPython\core\interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    self.index_map[numpy.datetime64('2017-01-04T00:00:00')]
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: numpy.datetime64('2017-01-04T00:00:00')
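For anyone following along, here is a minimal sketch (plain NumPy, no qlib) of how to inspect the two keys. It assumes nothing beyond `np.datetime_data`, which reports the unit stored in a datetime64 dtype. The last line only reports whether the hashes match, since my understanding is that this can differ between NumPy releases.

```python
import numpy as np

ns_key = np.datetime64("2017-01-04T00:00:00.000000000")
s_key = np.datetime64("2017-01-04T00:00:00")

# Equality compares the instants, so the unit does not matter here.
print(ns_key == s_key)

# The unit is part of the dtype; np.datetime_data returns (unit, count).
print(np.datetime_data(ns_key.dtype))
print(np.datetime_data(s_key.dtype))

# A dict lookup needs matching hashes as well as equality.
# Whether these agree may depend on the NumPy version, so just report it.
print(hash(ns_key) == hash(s_key))
```

If the last line prints False, the two keys land in different hash buckets, which would explain why the lookup misses even though `==` is True.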

It occurs at qlib/qlib/utils/index_data.py, line 157:

try:
    return self.index_map[self._convert_type(item)]
except IndexError as index_e:
    raise KeyError(f"{item} can't be found in {self}") from index_e

Maybe I did something wrong. Any help would be appreciated.

GeorgeGuo1202, Jun 08 '24 10:06

Can you describe in detail how to reproduce this? That will help us solve the problem.

SunsetWolf, Jun 13 '24 08:06

Here is a code reproduction based on the issue description. Although the two numpy.datetime64 values are equal, the difference in precision causes a dict indexing failure.

The following code successfully reproduces the issue on Python 3.9.19, pyqlib 0.9.5.99, and numpy 1.23.5.

import numpy as np
from qlib.utils.index_data import Index, SingleData

index = Index([np.datetime64('2017-01-04T00:00:00.000000000'),
               np.datetime64('2017-01-05T00:00:00.000000000'),
               np.datetime64('2017-01-06T00:00:00.000000000')])

data = SingleData([1, 2, 3], index=index)

# print: True
print(np.datetime64('2017-01-04T00:00:00.000000000') == np.datetime64('2017-01-04T00:00:00'))

# print: 0
print(data.index.index_map[np.datetime64('2017-01-04T00:00:00.000000000')])

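# raises KeyError on the versions above: the key has second precision,
# while the index stores ns-precision keys
try:
    print(data.index.index_map[np.datetime64('2017-01-04T00:00:00')])
except KeyError as e:
    print("KeyError:", e)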

# print: 0
print(data.index.index_map[np.datetime64(np.datetime64('2017-01-04T00:00:00'), 'ns')])
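The last lookup above suggests a possible workaround: cast every key to one fixed unit before it touches the dict. Below is a minimal sketch of that idea using a plain dict instead of qlib's `Index`. The names `normalize_key`, `index_map`, and `lookup` are hypothetical helpers for illustration only, not part of qlib, and this is not a proposed patch for `index_data.py`.

```python
import numpy as np


def normalize_key(key, unit="ns"):
    """Cast a datetime64 key to a fixed unit (hypothetical helper)."""
    return np.datetime64(key, unit)


# Build the map with normalized keys, regardless of the input precision.
dates = ["2017-01-04", "2017-01-05", "2017-01-06"]
index_map = {normalize_key(np.datetime64(d)): i for i, d in enumerate(dates)}


def lookup(key):
    """Normalize the query key the same way before the dict lookup."""
    return index_map[normalize_key(key)]


# Both precisions now resolve to the same position.
print(lookup(np.datetime64("2017-01-04T00:00:00")))
print(lookup(np.datetime64("2017-01-04T00:00:00.000000000")))
```

Since both the stored keys and the query key end up with the same unit, the lookup no longer depends on how NumPy hashes values of different precision. The trade-off is that ns precision limits the representable date range, so a coarser unit may be a better fit for daily data.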

akazeakari, Jun 17 '24 17:06